Dear all,
First, some background for the idea: I have some PDF files which are damaged. My goal is to OCR whatever can be repaired. (I recently tested "Nuance Power PDF Advanced2". IMHO it can OCR many problematic PDFs that other OCR software can't even open. But alas, it still has problems with some PDF files.)
I have tried several tools and techniques. The best ones so far are:
- 3-Heights™ PDF Analysis & Repair (they also sell a shell version).
https://www.pdf-tool...pdf-analysis-repair/ (the free version can be used here:
https://www.pdf-onli....com/osa/repair.aspx )
The problem is that it doesn't repair all defects properly. ;(
- and a batch script using SumatraPDF and the Bullzip printer (see
https://www.donation...opic=42713.msg399623 ).
The problem here is that it takes a lot of time, CPU and memory. For instance, a 100 MB PDF uses 16 GB of temporary SSD space and, after 10 minutes, finally produces ("prints") a 300 MB PDF!
Also, for several PDF files the process runs to the end, yet no PDF file is created! ;(
So I got this idea: the nice thing is that I can open most of the damaged PDFs with SumatraPDF.
So it would be great if some software could, once the PDF is opened in SumatraPDF, take a screenshot of each page in burst mode (take one screenshot, turn to the next page, then repeat). Then I could probably make a PDF from the image files and OCR it very quickly?
I did test Screenshot Captor version 4
https://www.donation...hotcaptor/index.html but I couldn't get it to work (the automatic page "down/up" did not work on Win 8.1 64-bit)!
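To make the "one image per page" idea more concrete, here is a rough sketch of an alternative to literal screenshots. SumatraPDF renders PDFs with the MuPDF engine, and MuPDF ships a command-line tool, mutool, whose "draw" command renders pages to image files. The sketch assumes mutool is installed and on the PATH; whether it can open the same damaged files as SumatraPDF is untested. The function names, the 150 dpi default and the file names are only placeholders.

```python
import subprocess
from pathlib import Path


def build_render_command(pdf_path, out_dir, dpi=150):
    """Build a `mutool draw` command that writes one PNG per page."""
    # mutool replaces %05d with the page number (page-00001.png, ...).
    out_pattern = str(Path(out_dir) / "page-%05d.png")
    return ["mutool", "draw", "-r", str(dpi), "-o", out_pattern, str(pdf_path)]


def render_pages(pdf_path, out_dir, dpi=150):
    """Render the pages to PNG and return the image paths in page order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # check=False: mutool may exit with an error on a damaged file, so we
    # simply keep whatever pages it managed to write.
    subprocess.run(build_render_command(pdf_path, out_dir, dpi), check=False)
    return sorted(Path(out_dir).glob("page-*.png"))


# Hypothetical usage (placeholder file and folder names):
#   images = render_pages("damaged.pdf", "pages")
```

The resulting PNG files could then be handed to the OCR software, which avoids both the printer driver and the page-turning automation.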
Thanks in advance