Hello Steven Avery.
OCRing (Optical Character Recognition) a 20MB image-scanned book shouldn't present any problems. There is plenty of technology to do it for $FREE without needing to be scammed by Adobe and others.
I've been very interested in MICR, digital imaging and OCR technologies for years, as they provide an essential means of automated primary text data capture.
In response to your above query, I'd suggest you check out these for starters:
* PDF-XChange Viewer ($FREE version) - Mini-Review. I did that review riding on the shoulders of various other erstwhile denizens of DCF who had preceded me.
* OCR - comparisons of different software/capability
* Qiqqa - Reference Management System - Mini-Review - a brilliant library management tool; it indexes existing .PDF OCRed documents, and scans and OCRs existing .PDF imaged documents and then indexes them. You can read your library of .PDF documents in Qiqqa.
Hope that helps or is of use.
I probably should update those reviews/notes, because the technology will have improved and what may have been perceived as shortcomings or niggles then will probably have been cleared away by now...
Could you please add to the knowledge base here (in this forum) by noting and cross-referencing whatever you discover whilst trying to meet your PDF OCR and text capture needs?
Thanks.