Get the height and width of a PDF file using PHP

I have a script that creates a thumbnail of a PDF file using Imagick in PHP.

It creates the thumbnail from the first page of the PDF file.

I can produce the thumbnail without any problem when I use a fixed height and width.

However, I need to get the height and width of the PDF file's first page and calculate the thumbnail's height and width accordingly.

If I were creating the thumbnail from an image, I could use PHP's getimagesize function. Is there a similar function for getting the height and width of the first page of a PDF file?

You can access the first page of any multipage file format that ImageMagick can read by appending [0] to the filename.

This means that you can ask identify to print the width and height of the first page of a PDF with the following command, which you should have no problem translating into PHP syntax:

 identify  -format "width: %W  --  height: %H\n"  some.pdf[0]

This will print the values for the first page's MediaBox in the following format:

  width: 345  --  height: 777

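The unit of these values is PostScript points (where 72 pt == 1 inch); strictly speaking, identify reports pixels, which equal points at ImageMagick's default density of 72 DPI. Of course, you are free to modify the command to suit your needs, for example to output only the two numbers or to use the WxH format: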

 identify  -format "%W %H\n"  some.pdf[0]
 identify  -format "%Wx%H\n"  some.pdf[0]

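To get from these two numbers to thumbnail dimensions, the shell sketch below scales a page size so that it fits inside a bounding box while keeping the aspect ratio. The function name fit_within and the 200x200 box are made up for illustration, and the sketch assumes integer input values; the same arithmetic should carry over directly to PHP.

```shell
#!/bin/sh
# fit_within WIDTH HEIGHT MAX_W MAX_H   (hypothetical helper, integers only)
# Prints "W H": the largest size with the aspect ratio of WIDTH x HEIGHT
# that fits inside MAX_W x MAX_H. Results are rounded down, minimum 1.
fit_within() {
    w=$1 h=$2 max_w=$3 max_h=$4
    # Compare the aspect ratios by cross-multiplying, which avoids fractions:
    # w/h > max_w/max_h  is the same as  w*max_h > h*max_w
    if [ $((w * max_h)) -gt $((h * max_w)) ]; then
        # The page is relatively wider than the box: width is the limit.
        tw=$max_w
        th=$((h * max_w / w))
    else
        # The page is relatively taller (or has the same ratio): height is the limit.
        th=$max_h
        tw=$((w * max_h / h))
    fi
    [ "$tw" -lt 1 ] && tw=1
    [ "$th" -lt 1 ] && th=1
    echo "$tw $th"
}

# Possible usage with the identify command from above (not run here):
#   set -- $(identify -format "%W %H\n" "some.pdf[0]")
#   fit_within "$1" "$2" 200 200
```

For the 345 x 777 page from the example above and a 200x200 box, this prints 88 200. Quoting "some.pdf[0]" keeps the shell from treating the brackets as a glob pattern.
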
However, be aware of the following facts:

  1. PDF also supports the optional TrimBox, CropBox, ArtBox and BleedBox settings.
  2. The most important of these is the CropBox, because:
  3. Should the CropBox be different from the MediaBox (it needs to be the same or smaller and isn't allowed to be bigger!), then PDF viewers and printer drivers are asked to render only the part of the page that's inside that box. (The TrimBox, by contrast, marks the intended size of the finished page after trimming.)
  4. By default, identify returns the MediaBox values only and ignores the other boxes.
  5. Likewise, convert will use the (potentially bigger) MediaBox size of the PDF page to render the image, so its result may look different from what you see in a PDF viewer.
  6. Luckily, PDFs with CropBox values that differ much from the MediaBox values are not very common.
  7. If you need the values of all the boxes, you should use a different command-line utility to extract the relevant info: pdfinfo -box -f 1 -l 1 some.pdf | grep -E '(Box:|rot:|size:)'. (Use the Poppler version of pdfinfo if possible...)
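
As a rough sketch of how the pdfinfo output could be turned into usable numbers, the helper below (box_size is a hypothetical name) reads that output on stdin and prints the width and height of one box in points. It assumes each box line ends in four numbers (x1 y1 x2 y2), which is what Poppler's pdfinfo -box prints as far as I know; check this against your version before relying on it.

```shell
#!/bin/sh
# box_size BOXNAME   (hypothetical helper)
# Reads `pdfinfo -box` output on stdin and prints "WIDTH HEIGHT" in points
# for the first line that mentions BOXNAME (e.g. MediaBox or CropBox).
# Assumption: the box name is followed by four numbers, x1 y1 x2 y2.
box_size() {
    awk -v box="$1:" '
        {
            for (i = 1; i <= NF; i++) {
                if ($i == box && NF >= i + 4) {
                    # Width and height are the differences of the corner coordinates.
                    printf "%.2f %.2f\n", $(i + 3) - $(i + 1), $(i + 4) - $(i + 2)
                    exit
                }
            }
        }'
}

# Possible usage (not run here):
#   pdfinfo -box -f 1 -l 1 some.pdf | box_size CropBox
```

If the CropBox size differs from the MediaBox size, the CropBox is probably the better basis for the thumbnail's aspect ratio, since that is the area a viewer shows.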