DonationCoder.com Software > Finished Programs

DONE: Check folder and tell me which PDFs are images (non-searchable)

vevola:
I have a huge number of PDFs, and often I perform text searches across my collection. But I just realized: some of these PDFs are images!

I'd like to be able to know which PDFs are images so that I can convert them to searchable text. But there are so many of them in my folder, I'd have to open them one by one and check manually.

Would it be possible to have a little program that would check in certain folders and tell me which PDFs are images (even with a certain degree of certainty)?

TIA!!

skwire:
Could you please attach one of those image PDFs here?  Or send me one in a PM?  Thanks.

vevola:
Here's an example of an image PDF (non-searchable text).

Other examples are all the downloadable ebooks and pages from Google Books (either the free versions, or the ones you get using Google Book Downloader for Greasemonkey from http://book.huhiho.com/).

Thanks!

skwire:
I have some working code, so how do you want this to work? I can make a full GUI for it, or I can simply make it recurse through your PDF folder(s) and spit out a text file at the end listing which files had no searchable text. Your thoughts?
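
For anyone following along, the "recurse and spit out a text file" option could look roughly like the Python sketch below. To be clear, this is not skwire's code (the thread doesn't show it), just one way the idea might be done. It assumes the third-party pypdf package for text extraction, and the function names and the `min_chars` threshold are made up for illustration. A PDF with almost no extractable text gets flagged as a probable image-only file, so it's a guess with a certain degree of certainty rather than a guarantee.

```python
from pathlib import Path


def extract_text_with_pypdf(pdf_path):
    """Return all extractable text in a PDF (assumes: pip install pypdf)."""
    from pypdf import PdfReader  # imported lazily; only needed for real PDFs

    reader = PdfReader(str(pdf_path))
    return "".join(page.extract_text() or "" for page in reader.pages)


def find_image_only_pdfs(root, extract_text=extract_text_with_pypdf, min_chars=20):
    """Walk root recursively and list PDFs with (almost) no searchable text.

    min_chars is an arbitrary cut-off chosen for this sketch: a file whose
    extracted text is shorter than that is treated as a likely scan.
    """
    suspects = []
    for path in sorted(Path(root).rglob("*")):
        # Compare the suffix case-insensitively so .PDF is picked up too.
        if not path.is_file() or path.suffix.lower() != ".pdf":
            continue
        try:
            text = extract_text(path)
        except Exception:
            # Unreadable or encrypted files are flagged for a manual look.
            text = ""
        if len(text.strip()) < min_chars:
            suspects.append(path)
    return suspects


def write_report(suspects, report_path):
    """Write one flagged file path per line to a plain text file."""
    lines = "".join(f"{p}\n" for p in suspects)
    Path(report_path).write_text(lines, encoding="utf-8")
```

Usage would be something like `write_report(find_image_only_pdfs("C:/ebooks"), "image_pdfs.txt")`, where both paths are placeholders. The extractor is passed in as a parameter so a different PDF library could be swapped in without touching the folder-walking part.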

vevola:
Well, I'm a GUI kinda guy! It would be great to choose which folders I want to search in, and then have a list that I could sort by filename or folder so it would be easier to work with...
