Author Topic: Is simple PDF creation + Content indexing possible? (Read 5290 times)

tsaint · « **on:** October 01, 2008, 07:15 PM »

I've been looking to scan lots of receipts, statements, etc., convert them to PDF, and then index their CONTENTS. Innocently, I thought that would be simple, as X1, Copernicus et al. say they index PDFs, JPGs, etc., and I have a couple of print-to-PDF tools (Primo, doPDF).
None of the indexers seem to make it clear up front, though, that they only index the file names, not the content, of JPG/image files (or have I got that wrong?).

PDFs seem rather trickier. E.g., if I create a PDF with doPDF using a Word doc as the source, the contents will be indexed. If I create the PDF with doPDF from an image, the contents won't be indexed.
(Probably if I use Acrobat Pro to create the PDFs it will work?)
What does get indexed depends on the search software too, as I discovered using both X1 and Copernicus.

So, my question is, please: what's a simple, reliable, cheap way to create PDFs from a scanned doc whose CONTENTS are searchable by X1 or Copernicus or GDS?
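
The difference described above (a PDF printed from a Word doc gets indexed, a PDF printed from an image does not) comes down to whether the file contains a text layer or only a picture of the page. Below is a rough sketch in Python for checking that. It is a heuristic, not a real PDF parser: the function name `looks_text_searchable` is made up for illustration, and it simply scans each content stream (inflating it where possible) for the PDF text-showing operators `Tj`/`TJ` inside a `BT ... ET` block. It can be fooled by unusual files, so treat the result as a hint only.

```python
import re
import zlib

# Grab everything between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A text object (BT ... ET) that contains a text-showing operator.
TEXT_OP_RE = re.compile(rb"\bBT\b.*?\b(?:Tj|TJ)\b.*?\bET\b", re.DOTALL)


def looks_text_searchable(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any stream appears to draw text on the page."""
    for match in STREAM_RE.finditer(pdf_bytes):
        data = match.group(1)
        try:
            # Most content streams are Flate (zlib) compressed.
            data = zlib.decompressobj().decompress(data)
        except zlib.error:
            pass  # not Flate data; fall back to scanning the raw bytes
        if TEXT_OP_RE.search(data):
            return True
    return False
```

To try it on a file, read it in binary mode and pass the bytes in, e.g. `looks_text_searchable(open("receipt.pdf", "rb").read())` (the file name is only an example). A result of `False` suggests the PDF is image-only and would need OCR before a desktop search tool can index its contents.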

Darwin · « **Reply #1 on:** October 01, 2008, 11:05 PM »

Well, Evernote Pro is able to recognize text in image files. It's not a pdf solution, but it is a workaround. Beyond that, Nuance's PDF Converter Pro 5 creates searchable pdfs. I don't know if the lower cost versions do this as well or not. I'm not aware of other non-Adobe solutions that do this... though I am sure that there must be others. What you're looking for is an application that will create and/or convert pdfs into "searchable" pdfs.
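
For illustration only, here is one way the "convert a scan into a searchable PDF" step could be scripted. This sketch swaps in the open-source Tesseract OCR command-line tool, which is not mentioned in this thread, and assumes a reasonably recent version is installed and on the PATH. The helper names `build_ocr_command` and `make_searchable_pdf` are hypothetical. The only Tesseract usage relied on is the form `tesseract <image> <outputbase> -l <lang> pdf`, which writes `<outputbase>.pdf` with a hidden text layer behind the page image.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_ocr_command(image: str, out_base: str, lang: str = "eng") -> List[str]:
    """Build the Tesseract call that writes <out_base>.pdf with a text layer."""
    return ["tesseract", image, out_base, "-l", lang, "pdf"]


def make_searchable_pdf(image: str, out_base: str, lang: str = "eng") -> Path:
    """Run OCR on a scanned image; assumes Tesseract is installed."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    subprocess.run(build_ocr_command(image, out_base, lang), check=True)
    return Path(out_base + ".pdf")
```

Something like `make_searchable_pdf("receipt.png", "receipt")` would then produce `receipt.pdf` (file names are examples). Whether X1, Copernicus or GDS actually index the result still depends on the search tool, as noted in the first post.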

Darwin · « **Reply #2 on:** October 01, 2008, 11:13 PM »

BTW, there's an extended discussion of this issue in this thread

tsaint · « **Reply #3 on:** October 03, 2008, 10:06 PM »

Thanks for your reply, Darwin. Sorry I took so long to reply, but out of the blue my mouse just stopped working (all mice, that is) and I got distracted.
I read a few threads, but my question seemed to cross several (e.g. PDF creation, desktop searching), and as I'd never seen a simple answer (leaving aside Evernote) to the scan/search question, I decided to ask it in a new thread.
Tony

Paul Keith · « **Reply #4 on:** October 04, 2008, 12:02 AM »

Edit: Wrong topic

tsaint · « **Reply #5 on:** October 04, 2008, 12:10 AM »

Sorry, I see now I should have included "by desktop search engines" in the topic (although that might be inferred from the "indexing" perhaps).

Edit: Wrong topic
-Paul Keith (October 04, 2008, 12:02 AM)
