I'm trying to convert a multipage PDF to JPG with Ghostscript in PHP. The command right now looks something like this:
gs -q -dBATCH -sDEVICE=jpeg -dNOPAUSE -dSAFER -dJPEGQ=100 -dGraphicsAlphaBits=4 -dTextAlphaBits=4 -r72 -sOutputFile=- some.pdf
What I want is a way to pass the PDF in as a string, which looks something like this: '%PDF-1.4 %���� 1 0 obj <> endobj 2 0 obj <> endobj, etc.', and output all pages to stdout. Providing an actual PDF file to the command works fine, but it returns a single page. When writing to files there is the p%03d.jpg output pattern to get all the pages, but I need the output dumped to temp/memory. From what I understand, you need to use pipes to get this to work. I tried proc_open() without any success, because I don't know how to pass the string to the pipe.
$args = [
'-dBATCH',
'-sDEVICE=jpeg',
'-dNOPAUSE',
'-dSAFER',
'-dJPEGQ=100',
'-dGraphicsAlphaBits=4',
'-dTextAlphaBits=4',
'-r72',
'-sOutputFile=-',
$path . '/some.pdf' // this should be passed as a string via stdin
];
$descr = [
0 => ['pipe', 'r'],
1 => ['pipe', 'w'],
2 => ['pipe','w']
];
$pipes = array();
$args = implode(' ', $args);
$commd = "gs -q $args";
$process = proc_open($commd, $descr, $pipes);
$response = '';
if (is_resource($process)) {
fputs($pipes[0], $pdf);
fclose($pipes[0]);
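// Read the binary output in one go; fgets() is line-oriented and the loop stops early on a falsy chunk such as "0".
$response = stream_get_contents($pipes[1]);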
fclose($pipes[1]);
fclose($pipes[2]);
proc_close($process);
}
echo '<img src="data:image/jpeg;base64,' . base64_encode($response) . '" />';
Update: Found the solution for the input. It's a dash in place of the last argument, which represents the input file. The multipage output still remains an issue.
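For reference, a minimal sketch of feeding a string to a process through stdin with proc_open(). The helper name runWithStdin is made up for illustration, and the sketch assumes a Unix-like system (stderr is discarded to /dev/null) with gs on the PATH. It writes all the input before reading any output, so a program that starts producing output before it has consumed its input could deadlock with this simple ordering.

```php
<?php
// Hypothetical helper: runs $command, writes $input to its stdin and
// returns everything the process wrote to stdout.
function runWithStdin(string $command, string $input): string
{
    $descriptors = [
        0 => ['pipe', 'r'],              // the child reads its stdin from us
        1 => ['pipe', 'w'],              // the child writes its stdout to us
        2 => ['file', '/dev/null', 'a'], // assumption: Unix-like system, stderr discarded
    ];

    $process = proc_open($command, $descriptors, $pipes);
    if (!is_resource($process)) {
        throw new RuntimeException('Could not start process');
    }

    fwrite($pipes[0], $input);
    fclose($pipes[0]); // closing stdin signals end of input to the child

    $output = stream_get_contents($pipes[1]); // binary-safe read until EOF
    fclose($pipes[1]);
    proc_close($process);

    return $output === false ? '' : $output;
}

// Usage (not run here; $pdf is assumed to hold the raw PDF bytes):
// $jpegs = runWithStdin(
//     'gs -q -dBATCH -dNOPAUSE -dSAFER -sDEVICE=jpeg -r72 -sOutputFile=- -',
//     $pdf
// );
```

The trailing dash in the usage line is the input "file" (stdin); the dash after -sOutputFile= is stdout.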
You can't render a PDF file in memory using Ghostscript.
Ghostscript only processes PDF files from disk. If you pipe the input from stdin all that happens is that Ghostscript creates a temporary file, stores the PDF in that, and then renders the temporary file. This is because PDF files inherently require the ability to seek randomly within the file.
So in fact by sending the file via stdin you're just moving the creation of the temporary file to being done inside Ghostscript instead of doing it yourself. If you think you are somehow improving performance by doing this, you are mistaken.
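If you do end up managing the temporary files yourself, a rough sketch could look like the following. The function names buildGsCommand and renderPdfToJpegs are hypothetical, the flags are the ones from the question, and the code assumes gs is on the PATH of a Unix-like system (on Windows, escapeshellarg() would mangle the % in the output pattern).

```php
<?php
// Hypothetical helper: builds the gs command line for one input file and
// one output pattern, using the flags from the question.
function buildGsCommand(string $pdfPath, string $outPattern): string
{
    $args = [
        '-q', '-dBATCH', '-dNOPAUSE', '-dSAFER',
        '-sDEVICE=jpeg', '-dJPEGQ=100',
        '-dGraphicsAlphaBits=4', '-dTextAlphaBits=4', '-r72',
        '-sOutputFile=' . $outPattern,
        $pdfPath,
    ];
    return 'gs ' . implode(' ', array_map('escapeshellarg', $args));
}

// Hypothetical helper: renders every page of $pdf (raw PDF bytes) and
// returns the JPEGs as an array of binary strings, one entry per page.
function renderPdfToJpegs(string $pdf): array
{
    $dir = sys_get_temp_dir() . '/gs_' . bin2hex(random_bytes(8));
    if (!mkdir($dir, 0700)) {
        throw new RuntimeException("Cannot create $dir");
    }
    $pdfPath = $dir . '/input.pdf';
    file_put_contents($pdfPath, $pdf);

    // %03d makes Ghostscript write one numbered file per page.
    exec(buildGsCommand($pdfPath, $dir . '/p%03d.jpg'), $ignored, $status);

    $pages = [];
    foreach (glob($dir . '/p*.jpg') ?: [] as $file) { // glob() sorts by name
        $pages[] = file_get_contents($file);
        unlink($file);
    }
    unlink($pdfPath);
    rmdir($dir);

    if ($status !== 0) {
        throw new RuntimeException("gs exited with status $status");
    }
    return $pages;
}
```

The pages end up in memory as separate strings, and the temporary directory is removed again before the function returns.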
If you specify - (stdout) as the output file then all the output is sent to stdout. If there's more than one page, then all the pages are sent to the output (what else could it do?). It's up to you to figure out where each page ends and split it up.
If you omit the -q and look at what gets sent to stdout (e.g. by redirecting it to a file) you will see that the usual Ghostscript boilerplate is sent at the start. If you further omit the -dNOPAUSE (note you will need to press 'return' for each page and you won't be prompted, so just hammer the key a bit) and then look at the output you will see that each page is separated by
>>showpage, press <return> to continue<<
So you can see that each page is sent, and it's up to you to figure out where each JPEG ends.
I'm not sure what else you were expecting to happen, given that you are sending multiple pages of output to stdout.
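As a rough illustration of that splitting step, here is a heuristic sketch, not something Ghostscript provides. It assumes the stream contains nothing but back-to-back JPEG files (so -q and -dNOPAUSE are in effect) and cuts wherever an end-of-image marker (FF D9) is immediately followed by the start of the next file (FF D8 FF). Those byte sequences cannot occur inside the entropy-coded image data because of JPEG byte stuffing, but they could in principle appear inside a header segment, so treat this as a best-effort split. The function name splitJpegStream is made up for the example.

```php
<?php
// Hypothetical helper: splits a string of concatenated JPEG files into
// one binary string per image.
function splitJpegStream(string $stream): array
{
    // Zero-width split point: EOI (FF D9) behind us, SOI + next marker
    // (FF D8 FF) ahead of us. The pattern runs in byte mode (no /u flag).
    $pages = preg_split(
        '/(?<=\xFF\xD9)(?=\xFF\xD8\xFF)/',
        $stream,
        -1,
        PREG_SPLIT_NO_EMPTY
    );
    return $pages === false ? [] : $pages;
}

// Usage: $response is assumed to hold the raw bytes read from gs's stdout.
// foreach (splitJpegStream($response) as $page) {
//     echo '<img src="data:image/jpeg;base64,' . base64_encode($page) . '" />';
// }
```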
I have tested this code locally and it works for me.
I'm using passthru() to run the command, which sends its output straight through, and output buffering to capture that output.
<?php
$command = "/path/to/gs -dBATCH -sDEVICE=jpeg -dNOPAUSE -dSAFER -dJPEGQ=100 -dGraphicsAlphaBits=4 -dTextAlphaBits=4 -r72 -sOutputFile=- ./someFile.pdf";
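// Start output buffering so the command's output is captured instead of being sent to the client.
ob_start();
// passthru() writes the command's raw (binary) output directly to the output buffer.
passthru($command);
// Copy the buffered output into a variable.
$response = ob_get_contents();
// Discard the buffer and turn output buffering off again.
ob_end_clean();
// Ghostscript's jpeg device produces JPEG data, so the MIME type is image/jpeg.
echo '<img src="data:image/jpeg;base64,' . base64_encode($response) . '" />';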