Re: [PHP] SCanning text of PDF documents
- Date: Thu, 15 May 2008 11:40:56 +0200
- From: Frank Arensmeier <frank@xxxxxxxxxxxx>
- Subject: Re: [PHP] SCanning text of PDF documents
A reliable solution depends partly on the PDF document itself. Consider whether your PDF contains rotated text or text that spans several different blocks or pages. My experience with ps2ascii and other Ghostscript-related tools is that sometimes they work quite well and sometimes the output is rather messy.
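For the Ghostscript route, something like the following might be a starting point. It is only a rough sketch: it assumes the ps2ascii script is installed and on the PATH of a Unix-like server, and the function names (buildPs2asciiCommand, normalizeExtractedText, extractPdfText) are made up for illustration. The cleanup step only collapses whitespace; it will not repair text that comes out in the wrong order.

```php
<?php
// Hypothetical helper: builds the shell command for Ghostscript's ps2ascii,
// which writes the extracted text to stdout when no output file is given.
function buildPs2asciiCommand(string $pdfPath): string
{
    // escapeshellarg() guards against spaces and shell metacharacters in the path.
    return 'ps2ascii ' . escapeshellarg($pdfPath);
}

// Hypothetical helper: tidies the raw output before it is stored.
function normalizeExtractedText(string $raw): string
{
    $text = str_replace(["\r\n", "\r"], "\n", $raw);  // unify line endings
    $text = preg_replace('/[ \t]+/', ' ', $text);      // collapse runs of spaces and tabs
    $text = preg_replace('/ ?\n ?/', "\n", $text);     // drop spaces around line breaks
    $text = preg_replace('/\n{3,}/', "\n\n", $text);   // keep at most one blank line
    return trim($text);
}

// Returns the extracted text, or null if the file is unreadable or ps2ascii fails.
function extractPdfText(string $pdfPath): ?string
{
    if (!is_readable($pdfPath)) {
        return null;
    }
    $lines = [];
    $status = 0;
    // Assumption: a Unix-like shell, so stderr can be discarded via /dev/null.
    exec(buildPs2asciiCommand($pdfPath) . ' 2>/dev/null', $lines, $status);
    if ($status !== 0) {
        return null;
    }
    return normalizeExtractedText(implode("\n", $lines));
}
```

The returned string could then be saved to the database with a prepared statement. A null result is the signal to fall back to another tool or to flag the upload for manual handling.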
The most reliable way of extracting text from a PDF is (I think) a product called PDFlib TET from PDFlib GmbH. Yes, a license costs some money, but you are then able to get almost everything out of the PDF.
http://www.pdflib.com/products/tet/

Maybe some magic with OpenOffice could do the trick as well?

//frank

On 15 May 2008 at 10:19, Angelo Zanetti wrote:
Hi All.

This is a quick question. A client of ours wants a solution where, when a PDF document is uploaded, we use PHP to scan the document's contents and save them in a DB. I know you can do this with normal text documents using the file commands and functions. Is it possible with PDF documents?

My feeling is NO, but perhaps someone will prove me wrong.

Thanks in advance.

Angelo
Web: http://www.elemental.co.za
Frank Arensmeier
Webmaster & IT Development
NIKE Hydraulics AB
Box 1107
631 80 Eskilstuna
Sweden
phone +46 - (0)16 16 82 34
fax +46 - (0)16 13 93 16
frank@xxxxxxxxxxxx
www.nikehydraulics.se