Author Topic: Is simple PDF creation + Content indexing possible?  (Read 3115 times)

tsaint

  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 451
  • Hi from the a*** end of the earth
Is simple PDF creation + Content indexing possible?
« on: October 01, 2008, 07:15:03 PM »
I've been looking to scan lots of receipts, statements, etc., convert them to PDF, and then index their CONTENTS. Innocently, I thought that would be simple, as X1, Copernic et al. say they index PDFs, JPGs, etc., and I have a couple of print-to-PDF programs (PrimoPDF, doPDF).
None of the indexers seem to make it clear up front, though, that they index only the file names, not the content, of JPG/image files. (Or have I got that wrong?)
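To make the file-name versus content distinction concrete, here is a minimal sketch of a toy index in Python. It is not how X1, Copernic or GDS work internally; the function names and sample files are made up for illustration, and an empty string stands in for a scanned image that has no extractable text.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def build_index(files, index_contents):
    """Map each token to the set of file names it was found in.

    `files` maps a file name to its extractable text. An empty string
    stands in for a scanned image: pixels only, nothing to extract.
    """
    index = defaultdict(set)
    for name, text in files.items():
        source = name + " " + (text if index_contents else "")
        for token in tokenize(source):
            index[token].add(name)
    return index

# Hypothetical sample data, not taken from any real indexer.
files = {
    "statement_2008_09.doc": "Electricity account, amount due 84.50",
    "scan0001.jpg": "",  # scanned receipt with no text layer
}

names_only = build_index(files, index_contents=False)
with_contents = build_index(files, index_contents=True)

print(sorted(names_only.get("electricity", set())))     # []
print(sorted(with_contents.get("electricity", set())))  # ['statement_2008_09.doc']
print(sorted(with_contents.get("scan0001", set())))     # ['scan0001.jpg']
```

The scanned image can only ever be found by its file name, because there is no text for the indexer to read until OCR has been run on it.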

For PDFs it seems rather trickier. E.g., if I create a PDF with doPDF using a Word doc as the source, the contents will be indexed. If I create the PDF with doPDF from an image, the contents won't be indexed.
(Probably if I use Acrobat Pro to create the PDFs it will work?)
What does get indexed depends on the search software too, as I discovered using both X1 and Copernic.

So, my question is, please: what's a simple, reliable, cheap way to create PDFs from scanned docs whose CONTENTS are searchable by X1, Copernic or GDS?

Darwin

  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 6,984
Re: Is simple PDF creation + Content indexing possible?
« Reply #1 on: October 01, 2008, 11:05:54 PM »
Well, Evernote Pro is able to recognize text in image files. It's not a PDF solution, but it is a workaround. Beyond that, Nuance's PDF Converter Pro 5 creates searchable PDFs. I don't know whether the lower-cost versions do this as well. I'm not aware of other non-Adobe solutions that do this, though I am sure there must be others. What you're looking for is an application that will create "searchable" PDFs and/or convert existing PDFs into them.
"Some people have a way with words, other people,... oh... have not way" - Steve Martin

Darwin

Re: Is simple PDF creation + Content indexing possible?
« Reply #2 on: October 01, 2008, 11:13:37 PM »
BTW, there's an extended discussion of this issue in this thread

tsaint

Re: Is simple PDF creation + Content indexing possible?
« Reply #3 on: October 03, 2008, 10:06:49 PM »
Thanks for your reply, Darwin. Sorry I took so long to respond, but out of the blue my mouse just stopped working (all mice, that is) and I got distracted.
I read a few threads, but my question seemed to cross several topics (e.g. PDF creation, desktop searching), and as I'd never seen a simple answer (Evernote aside) to the scan/search question, I decided to ask it in a new thread.
Tony

Paul Keith

  • Member
  • Joined in 2008
  • **
  • Posts: 1,982
Re: Is simple PDF creation + Content indexing possible?
« Reply #4 on: October 04, 2008, 12:02:11 AM »
Edit: Wrong topic

tsaint

Re: Is simple PDF creation + Content indexing possible?
« Reply #5 on: October 04, 2008, 12:10:50 AM »
Sorry, I see now I should have included "by desktop search engines" in the topic (although that might be inferred from the "indexing" perhaps).