OCR/Text Recognition in a .pdf document? How do I do this.

vegas:
I downloaded an old magazine the other day that had been scanned into a .pdf file. What I didn't understand is that when I searched for something within the file, results actually came back. How the heck is this done? I have tons of old magazines I want to be able to throw away someday (the sooner the better), but I'd like to keep all the information in them by scanning them to PDF files. Do I have to buy Adobe Acrobat to do this, or is other software used? Has anyone tried doing this with print material so that it is completely searchable? Thanks for any responses.   -vegas

cranioscopical:
...scanned into a .pdf file, but what I didn't understand, I went to search for something within the file and results actually came back...
Do I have to buy Adobe Acrobat to do this or is other software used?
--- End quote ---
Searchable text is certainly an option with Acrobat (the full product, not the free Reader).
I think there's at least one other .pdf 'handler' out there that claims to do the job. Unhelpfully, I can't recall what it is right now. If my brain starts working again (unlikely), I'll let you know the name. Others here will probably fill in the blank anyway. It's certainly a useful feature.

Try looking here:  http://searchable-text.qarchive.org/

Darwin:
A couple of years late... PDF Converter Professional 3 (and 4 and 5) lets you select "Save As" and then "Searchable PDF" from the File menu.

I've had very good results doing this on various PDFs that I have downloaded from the internet. Most of them are ancient journal articles (such as Science magazine articles from the 19th century), but some recent PDFs are also generated as image files rather than searchable text, and PDF Converter handles these as well. Most recently I did this on a 400+ page PhD dissertation with pages scanned at an angle and with lint and other debris from the platen glass visible... Worked like a charm.

Edvard:
I work in a copy shop where we just got a new Xerox machine - a WorkCentre 7655.
It will scan to a searchable pdf - pretty amazing stuff.
I'd recommend going to a reputable copy place in your area and asking their prices for scanning.
Our place is charging 50 cents (US) per, but other folks might be charging less.
Other than that, I don't know of any solution besides Acrobat pro.

Darwin:
Other than that, I don't know of any solution besides Acrobat pro.
-Edvard (August 28, 2008, 04:56 PM)
--- End quote ---

Well, there is PDF Converter Professional, as noted above! Actually, this also means that Zeon PDFDoc Gold should do it as well, since ScanSoft/Nuance license their product from Zeon.