Make a scanned PDF into a text-searchable PDF

Steven Avery:
Hi Forum!

Trying to do a 20 megabyte scanned book.

Once I did a book with PDF2GO, but the online tools all seem to have limits or glitch out on memory or something.

Willing to buy Shareware if need be.
Wondershare PDFElement will not do that feature in shareware mode and the cost is about $100 depending on license.
Willing to do it but only if there are not good alternatives.

My Acrobat Reader is great for reading PDFs but does not have that feature.
The same goes for my Soda PDF Desktop.

Your thoughts?

Thanks!



IainB:
Hello Steven Avery.
OCRing (Optical Character Recognition) a 20 MB image-scanned book shouldn't present any problems. There is plenty of technology to do it for $FREE without needing to be scammed by Adobe and others.

I've been very interested in MICR, digital imaging and OCR technologies for years, as they provide an essential primary automated text data capture functionality.

In response to your above query, I'd suggest you check out these for starters:
* PDF-XChange Viewer ($FREE version) - Mini-Review
I did that review riding on the shoulders of various other erstwhile denizens of DCF who had preceded me.

* OCR - comparisons of different software/capability

* Qiqqa - Reference Management System - Mini-Review - a brilliant library management tool, it indexes existing .PDF OCRed documents, and scans and OCRs existing .PDF imaged documents and then indexes them. You can read your library of .PDF documents in Qiqqa.

Hope that helps or is of use.
I probably should update those reviews/notes, because the technology will have improved and what may have been perceived as shortcomings or niggles then will probably have been cleared away by now...

Could you please add to the knowledge base here (in this forum) by noting and cross-referencing whatever you discover whilst trying to meet your PDF OCR and text capture needs?

Thanks.

kunkel321:
My experience with OCRing scanned stuff is that, if there are tables and you want to convert those into Word tables or Excel sheets, then you almost have to have something that uses the ABBYY OCR engine. Unfortunately, to convert entire PDFs, you need the expensive subscription ware (which I'm not willing to pay for). I do have the $10 ABBYY Screenshotter. It only does one page at a time though.

I use PDF-XChange Editor. I think it might use Tesseract technology. It's pretty good too. It messes up tables, but otherwise is good.

Steven Avery:
Thanks! Good answers. I have used PDF-XChange Viewer, but not on my current computer and not for OCR.

It will be my first choice.

Most of my books are already OCRed and searchable, but I am ready to use the tips above.

Steven Avery:
I think PDF-XChange Viewer was replaced by Editor and Editor Plus, although it still seems to be downloadable.
https://www.tracker-software.com/product/pdf-xchange-viewer/download?fileid=445

"The PDF-XChange Viewer has been replaced by the all NEW PDF-XChange Editor which extends the power of the Viewer PRO with many new features, headlining, Direct Content Editing of text based PDF files (Not PDFs created from images or scans).  A PDF-XChange Editor License will directly license the Viewer as well as the included PDF-XChange Lite virtual PDF printer."

Editor has a free version, but it does have restrictions.

Not sure yet what is best.
