0

Does anyone have ideas on how to extract images from a PDF in PHP?

kehers
  • I am trying to do the same thing. PDF images are stored as is, with all bytes intact. I have compiled a list of starting and ending bytes, but some are still missing: http://dadruid5.wordpress.com/2014/08/21/ending-and-starting-bytes-for-images/. Any help completing the list would be appreciated. For anyone directed here: if the file formats you need are listed, just search for the magic number and the end bytes, or for the stream (with trim). – Andrew Scott Evans Aug 22 '14 at 18:13
  • One more thing: on Linux (CentOS, Fedora, Ubuntu) you can call pdfimages [-options] from poppler-utils, either through a subprocess or from the command line. – Andrew Scott Evans Aug 22 '14 at 18:15

4 Answers

3

Take a look at pdfimages. Here is the description from the page:

Pdfimages saves images from a Portable Document Format (PDF) file as Portable Pixmap (PPM), Portable Bitmap (PBM), or JPEG files.

Pdfimages reads the PDF file, PDF-file, scans one or more pages, and writes one PPM, PBM, or JPEG file for each image, image-root-nnn.xxx, where nnn is the image number and xxx is the image type (.ppm, .pbm, .jpg).

NB: pdfimages extracts the raw image data from the PDF file, without performing any additional transforms. Any rotation, clipping, color inversion, etc. done by the PDF content stream is ignored.
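Since the question is about PHP, here is a minimal sketch of calling pdfimages from a script. It assumes the pdfimages binary is installed and on the PATH, and that PHP is allowed to use exec(). The function names build_pdfimages_command and extract_pdf_images are made up for illustration; only the -j flag (write JPEG images as .jpg) comes from pdfimages itself.

```php
<?php
// Build the shell command. Both paths are escaped so spaces or quotes are safe.
function build_pdfimages_command(string $pdfPath, string $outputRoot): string
{
    // -j asks pdfimages to save JPEG (DCT) images as .jpg rather than PPM.
    return 'pdfimages -j ' . escapeshellarg($pdfPath) . ' ' . escapeshellarg($outputRoot);
}

// Run pdfimages and return the files it wrote (hypothetical helper).
function extract_pdf_images(string $pdfPath, string $outputRoot): array
{
    exec(build_pdfimages_command($pdfPath, $outputRoot) . ' 2>&1', $output, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException('pdfimages failed: ' . implode("\n", $output));
    }
    // pdfimages names its output <root>-nnn.<ext>, so collect everything with that prefix.
    $files = glob($outputRoot . '-*');
    return $files === false ? [] : $files;
}
```

Usage would look something like `extract_pdf_images('/tmp/report.pdf', '/tmp/report-img')`, which should leave files such as /tmp/report-img-000.jpg next to the PDF if it contains any images.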

Espo
2

I believe you can use ImageMagick as well. You pass it command-line arguments and it captures a region of the page at the coordinates you provide. You will need to install the relevant packages first (RPMs, etc.).
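A rough sketch of what that could look like from PHP, assuming ImageMagick's convert command is installed together with Ghostscript (which ImageMagick normally relies on to read PDFs). On ImageMagick 7 the command is usually magick instead of convert. The function name build_crop_command and the 150 dpi density are my own choices, not anything required. Note that this renders the page and crops the rendering, so you get a re-rasterised picture rather than the original embedded image.

```php
<?php
// Build a convert command that renders one PDF page and crops a rectangle from it.
function build_crop_command(
    string $pdfPath,
    int $page,
    int $width,
    int $height,
    int $x,
    int $y,
    string $outPath
): string {
    // "file.pdf[0]" selects a single page; the index is zero-based.
    $source = sprintf('%s[%d]', $pdfPath, $page);
    // WxH+X+Y is ImageMagick's geometry syntax, in pixels of the rendered page.
    $geometry = sprintf('%dx%d+%d+%d', $width, $height, $x, $y);

    return sprintf(
        'convert -density 150 %s -crop %s +repage %s',
        escapeshellarg($source),
        escapeshellarg($geometry),
        escapeshellarg($outPath)
    );
}
```

You would then pass the result to exec(), for example `exec(build_crop_command('in.pdf', 0, 400, 300, 50, 120, 'out.png'));`. The coordinates depend on the density you render at, so expect some trial and error.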

lilott8
1

Check out PDFlib. Their TET product does just that: it gets the images and text out. The only thing it doesn't cover is vector images.

Jason Plank
Murray
0

If you have an existing PDF file, I suspect it's close to impossible to extract an image from it using PHP; you may have better luck with C. You need to disassemble the binary file, decode and decompress its contents, find where the image is stored, and then copy it out.

It's easier if you just copy'n'paste it.
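For the narrow case of JPEG images, the "find where the image is stored, then copy it" step can be approximated in plain PHP. This is only a sketch under some big assumptions: the JPEG data sits in the file uncompressed (as it usually does for DCTDecode streams), and the image has no embedded thumbnail, since a thumbnail carries its own start and end markers and would cut the result short. Images stored with other filters will not be found this way. The function name find_embedded_jpegs is made up.

```php
<?php
// Return every byte run that starts with the JPEG start-of-image marker (FF D8 FF)
// and ends at the next end-of-image marker (FF D9).
function find_embedded_jpegs(string $bytes): array
{
    $images = [];
    $offset = 0;
    while (($start = strpos($bytes, "\xFF\xD8\xFF", $offset)) !== false) {
        $end = strpos($bytes, "\xFF\xD9", $start + 3);
        if ($end === false) {
            break; // no closing marker after this start: stop scanning
        }
        // Copy from the start marker through the two-byte end marker.
        $images[] = substr($bytes, $start, $end + 2 - $start);
        $offset = $end + 2;
    }
    return $images;
}
```

To try it, read the file with `file_get_contents('doc.pdf')`, pass the bytes in, and write each returned string out with `file_put_contents()`. A more reliable version would parse the PDF stream objects and use their /Length entries instead of guessing from markers.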

OverLex