I want to EXTRACT both the text and the images from a PDF file using PHP. All the libraries seem to be about reading, and most of the other solutions either produce only text, produce only images, or are command-line based. I'm looking for a complete solution in PHP. Is this possible?
At this point, I'm also open to other suggestions. Perhaps there is a site with an API that you can submit the file to? Or perhaps someone can give instructions for a modern solution using the OpenOffice command-line tool, if that's even possible?
What about the Google Docs API? It has OCR that you might be able to work with.