I have for years been using "screen-scrapers" of one form or another, and I thus may be familiar with the context - at least - of the sort of problem you are working on.
Objectives:
- The objective seems to be to capture an image of a scrolling screenshot of the whole on-screen report, or something.
- Failing that, the objective seems to be to capture an image of a screenshot of each of the 60 or so pages of on-screen report data.
- Post-capture, the objective is then to OCR any alphameric text in the captured image(s). You apparently intend to use Adobe Acrobat for that, and I presume that the full version does that.
Approaches considered so far:
- NirSoft's SysExporter: as @Ath suggested, this would probably be the best approach, as it would grab the actual data rather than an image of it. That doesn't seem to work in your case.
- ScreenshotCaptor: You already tried that and it doesn't seem to work in your case, presumably because the report is in a proprietary output/display format, or something. If it were able to be displayed in a browser, then SC would probably be able to capture the image, and also the $FREE MS OneNote screen clipper - which seems to capture an entire scrollable web page with no hassle.
- Individual page display capture: This seems to be where you are at, at present.
Without knowing more about the application you are trying to capture the data from, it is difficult to imagine what constraints you are operating under, but I would suggest that most data analysis/management applications have some sort of output functionality - e.g., (say) to export the data, or to output to printer as a file in PDF format. I presume you have explored this avenue.
Since you wish to get the data into a form in which you can manipulate it independently - e.g., (say) in Excel - taking/exporting the data from the database directly in some way would be the best approach, as it will be error-free, whereas anything else (e.g., image capture and then OCR) would be a less desirable alternative, simply because it will introduce errors. I presume you have explored this avenue.
However, if you are now stuck with the only option seeming to be the tedious capture of the 60 or so screen report images and then OCRing those, then I suppose that is what you will have to do.
Rather than try to design, build and test an automated image capture process, I would suggest that you do it manually and focus instead on identifying the best (least error-prone) OCR system to use, post-capture. You may find that Acrobat does not fall into that category for your requirements.
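One rough way to compare OCR systems is to hand-check the text of one report page and score each system's output against it. The following is only a minimal sketch using Python's standard `difflib`; the engine names and the sample strings are made up for illustration and are not output from any real product.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(reference: str, ocr_output: str) -> float:
    """Return a 0..1 similarity score between hand-checked text and OCR text."""
    # autojunk=False stops difflib from ignoring frequent characters in long texts.
    return SequenceMatcher(None, reference, ocr_output, autojunk=False).ratio()


def rank_engines(reference: str, outputs: Dict[str, str]) -> List[Tuple[str, float]]:
    """Sort the candidate OCR outputs from most to least similar to the reference."""
    scores = [(name, similarity(reference, text)) for name, text in outputs.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical sample: one line of a report, typed in by hand as the reference,
# plus the text two imaginary OCR engines produced for the same line.
reference = "Acct 10045  Total 1,250.00"
outputs = {
    "engine_a": "Acct 10045  Total 1,250.00",
    "engine_b": "Acct lOO45  Total 1,25O.OO",  # typical 0/O and 1/l confusions
}

ranking = rank_engines(reference, outputs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A score like this is only a coarse guide; for numeric report data you would still want to eyeball the digits, since a single wrong character matters more than the overall ratio suggests.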
Ideally, you will need an OCR system that captures alphameric data in columnar form - assuming that the on-screen report has columns.
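To illustrate what "columnar form" involves, here is a small sketch of how recognised words might be put back into rows and columns from their screen positions. It assumes (hypothetically) that the OCR step gives you each word with x/y pixel coordinates; the `(x, y, text)` tuple layout, the tolerance value and the sample data are all made up for illustration, and real OCR tools will do this in their own way.

```python
def group_into_rows(words, y_tolerance=5):
    """Group (x, y, text) word boxes into rows of text ordered left to right.

    Words whose y position is within y_tolerance pixels of the first word
    in the current row are treated as being on the same report line.
    """
    rows = []  # each entry: {"y": y of the row's first word, "words": [(x, text), ...]}
    for x, y, text in sorted(words, key=lambda w: (w[1], w[0])):
        if rows and abs(y - rows[-1]["y"]) <= y_tolerance:
            rows[-1]["words"].append((x, text))
        else:
            rows.append({"y": y, "words": [(x, text)]})
    # Within each row, order the cells by their x position.
    return [[text for _, text in sorted(row["words"])] for row in rows]


# Hypothetical word boxes for a two-line, two-column report.
sample_words = [
    (10, 100, "Name"),
    (200, 101, "Qty"),
    (10, 130, "Widget"),
    (200, 129, "12"),
]

table = group_into_rows(sample_words)
for row in table:
    print(row)
```

The point is simply that an OCR system which keeps the column positions saves you from having to reconstruct them afterwards.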
There may be others, but there is only one system that I have come across that:
- (a) captures directly from the screen, and
- (b) then immediately OCRs the text.
That is ABBYY FineReader. It can be very accurate, with few (if any) errors, where the captured image is clear and the characters in the text are unambiguous.
So I would suggest that you consider using that as you can capture each screen and OCR it in one step, for all 60 or so screens.
ABBYY FineReader is a professional OCR solution allowing you to convert scanned and photographed documents as well as PDFs into editable formats.
The special thing about this software is its efficiency in use - in one step, it can capture tabulated on-screen text, OCR it and send it to the Clipboard in tabular-spaced format for use in any document, or directly to a Microsoft Excel spreadsheet (if Excel is already installed).
ABBYY FineReader has been reviewed and discussed elsewhere in the DC Forum. It is frequently available and offered for $FREE as a promotion for other ABBYY software. It usually comes as part of the ABBYY FineReader vX.0 Sprint suite of software bundled with some scanners.
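If you end up with tabular-spaced text on the Clipboard and paste it into a text file rather than straight into Excel, something like the following could turn it into CSV. This is only a sketch under my own assumptions: that columns are separated by a tab or by two or more spaces, and that single spaces occur only inside a cell. The function names and the sample report lines are hypothetical and are not part of any ABBYY tool.

```python
import csv
import io
import re

# Assumed column separator: a run of two or more spaces/tabs, or a single tab.
COLUMN_GAP = re.compile(r"[ \t]{2,}|\t")


def tabular_text_to_rows(text):
    """Split tabular-spaced text into a list of rows, each a list of cell strings."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between report sections
        rows.append(COLUMN_GAP.split(line.strip()))
    return rows


def rows_to_csv(rows):
    """Serialise rows as CSV text; cells containing commas are quoted automatically."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical pasted text from one captured report screen.
sample = (
    "Account   Name          Balance\n"
    "10045     J Smith       1,250.00\n"
    "10046     A Jones       87.50\n"
)

rows = tabular_text_to_rows(sample)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Repeating this for each of the 60 or so screens and appending the rows to one file would give a single sheet that Excel can open directly.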
Hope this helps or is of use.