Author Topic: ScreenshotCaptor - Auto Capture modes (Read 3625 times)

viau2555 · « **on:** January 10, 2017, 02:11 PM »

I've been using Screenshot Captor consistently for years now, and I think it's a great piece of software, though the help file will need an update eventually. At present I am trying some of the more advanced features, like the auto capture capabilities, and I can't find any help for my specific needs.

Challenge: I have a Windows scrolling list that I need to capture, but it is very long, some 3000 entries. Obviously I cannot put all entries on a single image. I need to process the information with OCR software, which is going to be Acrobat. The list content is very simple: it consists of numbers, in a single column of 3000 lines. My approach is to capture one PageDown screen per image. I will end up with some 60 images (60 files) that will all be converted to a single 60-page PDF file. Acrobat will then OCR the whole thing, and I will finally have all my entries available for text search.

I found that Screenshot Captor has a PostCapture capability. It is probably useful for my approach, but I need some guidance on how to program that Post Capture IDE.
Basically, I need to make a loop that will:

1- Capture one single Fixed Region
2- Send a PageDown to the window
3- Goto step 1 to repeat the process until I have covered all the available pages. Something like Loop Until 60 iterations (or so).
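
For illustration only, the three steps above could be sketched in Python, outside of Screenshot Captor. The `capture_region` and `send_page_down` callables are hypothetical placeholders, not part of Screenshot Captor or its PostCapture options; in practice they would be backed by whatever screenshot and key-sending tool is available.

```python
import time
from typing import Callable, List


def capture_pages(
    capture_region: Callable[[int], str],
    send_page_down: Callable[[], None],
    page_count: int = 60,
    settle_seconds: float = 0.0,
) -> List[str]:
    """Capture a fixed region, press PageDown, and repeat page_count times.

    capture_region(i) is a hypothetical callable that saves capture number i
    and returns its file name; send_page_down() is a hypothetical callable
    that sends one PageDown keystroke to the target window.
    """
    saved = []
    for page in range(1, page_count + 1):
        saved.append(capture_region(page))  # step 1: capture the fixed region
        if page < page_count:
            send_page_down()                # step 2: scroll to the next page
            time.sleep(settle_seconds)      # let the window redraw first
    return saved                            # step 3 is the loop itself


if __name__ == "__main__":
    # Dry run with stand-ins that only record what would happen.
    keys_sent = []
    files = capture_pages(
        capture_region=lambda i: f"page_{i:03d}.png",
        send_page_down=lambda: keys_sent.append("PageDown"),
        page_count=3,
    )
    print(files, len(keys_sent))
```

The dry run shows the shape of the loop: one capture per page and one PageDown between consecutive captures, so no keystroke is sent after the last page.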

It looks simple indeed. But where can I get the language reference for this PostCapture IDE? How does it work?
Where should I start looking? Where can I get some examples of a loop process?

Thanks for your help

Jim

mouser · « **Reply #1 on:** January 10, 2017, 02:59 PM »

You don't want post capture options (that's used if you want some external tool run automatically on every separate capture, like emailing it, etc.) .. you just want to do a scrolling capture.

You say 600 images but then 60 files. Is it 60 or 600?

60 images should be manageable with a single scrolling capture. 600 is a different matter.

Try the scrolling capture function. And I encourage you to try the MANUAL scrolling capture if you have trouble getting the automatic key sending options to work. Manual will have you pressing PageDn and PrtScr after each capture. Then SC will help you stitch the final result together.

Try it with 10 pages and see how well that works before you try 60.

After that you'll have a single tall image. You can then save that to a PDF if you like and OCR that PDF, though it would seem more logical to OCR the image directly.

Ath · « **Reply #2 on:** January 10, 2017, 03:20 PM »

Not to downplay your enthusiastic use of Screenshot Captor, but wouldn't grabbing a list of items from a scrolling window be easier using NirSoft's SysExporter (freeware)? The exception might be a .NET/WPF application that uses non-standard controls, which most low-level tools like SysExporter don't handle/support.
It will save you the errors that are bound to happen when OCR-ing images to text, and gain 'some' time in the extraction process.

mouser · « **Reply #3 on:** January 10, 2017, 03:37 PM »

Yes, I'll echo that -- if there is ANY way you can get those numbers short of having to use OCR, you should.
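
If OCR does turn out to be unavoidable, one cheap safeguard follows from the list being numbers only: any OCR'd line containing something other than digits is suspect. A minimal sketch, assuming the entries are plain unsigned integers with one entry per line (the function name is made up for this example):

```python
from typing import List, Tuple


def suspicious_lines(ocr_text: str) -> List[Tuple[int, str]]:
    """Return (line number, text) for every non-empty line that is not all digits."""
    flagged = []
    for number, line in enumerate(ocr_text.splitlines(), start=1):
        value = line.strip()
        # isascii() guards against Unicode look-alikes that isdigit() accepts.
        if value and not (value.isascii() and value.isdigit()):
            flagged.append((number, value))
    return flagged


if __name__ == "__main__":
    sample = "1001\nl002\n\n10O4\n2005"  # 'l' for '1' and 'O' for '0'
    print(suspicious_lines(sample))
```

This only catches misreads that produce a non-digit character; a digit misread as another digit (say 3 for 8) passes unnoticed, which is one more reason to avoid OCR if the data can be extracted directly.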

viau2555 · « **Reply #4 on:** January 11, 2017, 12:38 PM »

Thanks Ath, I tried your suggestion and found SysExporter quite useful. Unfortunately, it didn't work for my specific application. The tree-view window that I wanted to extract did show up in SysExporter's main window with 2717 items (which makes total sense), but even when it was selected, no items were shown to export. I did some reading in the help file, and usage is pretty straightforward. I tried other applications and SysExporter works well with them. I don't know what is wrong with my particular app. I suspect it includes some well-thought-out strategies for preventing data extraction. Any ideas about what I can try next?

Thanks for your help

PS: By the way, it is 60ish pages per window, not 600. I have about 40 files containing from 30 to 2500 entries each, shown in page scrolls of about 40 items per page, depending on the window size. That is a lot of data.
If there is a way to avoid OCR and reading mistakes, I sure would prefer that.
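
As a rough check of those figures, the number of PageDown captures per file is the entry count divided by the entries visible per page, rounded up. The 40-per-page value is the approximate one quoted above, and the function name is made up for this sketch.

```python
import math


def pages_needed(entries: int, per_page: int = 40) -> int:
    """Screens needed to show `entries` rows at `per_page` rows per screen."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return math.ceil(entries / per_page)


if __name__ == "__main__":
    for entries in (30, 2500, 2717):
        print(entries, "entries ->", pages_needed(entries), "captures")
    # 30 entries -> 1 captures
    # 2500 entries -> 63 captures
    # 2717 entries -> 68 captures
```

So the largest files land in the 60ish range per window, and the smallest fit on a single screen.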

Ath · « **Reply #5 on:** January 12, 2017, 01:40 AM »

I suspect it include some well thought strategies for preventing data extraction.
-viau2555 (January 11, 2017, 12:38 PM)

That sounds like the most viable cause. Not much to do about that, then.

IainB · « **Reply #6 on:** January 14, 2017, 01:31 PM »

@viau2555: I have for years been using "screen-scrapers" of one form or another, and I thus may be familiar with the context - at least - of the sort of problem you are working on.

Objectives:

The objective seems to be to capture a single scrolling screenshot of the whole on-screen report.
Failing that, the objective seems to be to capture a screenshot of each of the 60 or so pages of on-screen report data.
Post-capture, the objective is then to OCR any alphameric text in the captured image(s). You apparently intend to use Adobe Acrobat for that, and I presume that the full version does that.

Possible approaches:

NirSoft's SysExporter: as @Ath suggested, this would probably be the best approach, as it would grab the actual data rather than an image of it. That doesn't seem to work in your case.
ScreenshotCaptor: You already tried that and it doesn't seem to work in your case, presumably because the report is in a proprietary output/display format, or something. If it were able to be displayed in a browser, then SC would probably be able to capture the image, and also the $FREE MS OneNote screen clipper - which seems to capture an entire scrollable web page with no hassle.
Individual page display capture: This seems to be where you are at, at present.

Without knowing more about the application you are trying to capture the data from, it is difficult to imagine what constraints you are operating under, but I would suggest that most data analysis/management applications have some sort of output functionality - e.g., (say) to export the data, or to output to printer as a file in PDF format. I presume you have explored this avenue.

Since you wish to get the data into a form you can manipulate independently - e.g., (say) in Excel - then taking/exporting the data from the database directly in some way would be the best approach, as it will be error-free, whereas anything else (e.g., image capture and then OCR) would be undesirable as an alternative, simply because it will introduce errors. I presume you have explored this avenue.

However, if you are now stuck with the only option seeming to be the tedious capture of the 60 or so screen report images and then OCRing those, then I suppose that is what you will have to do.
Rather than try to design, build and test an automated image capture process, I would suggest that you do it manually and focus instead on identifying the best (least error-prone) OCR system to use, post-capture. You may find that Acrobat does not fall into that category for your requirements.
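
One possible way to compare OCR systems on this kind of data is to type out a small sample page by hand and count how many lines each tool gets wrong. The sketch below assumes one entry per line and that each tool's output has been saved as plain text; the function names are invented for illustration and do not belong to any OCR product.

```python
from itertools import zip_longest
from typing import List, Tuple


def compare_ocr(expected: List[str], ocr_lines: List[str]) -> List[Tuple[int, str, str]]:
    """Return (line number, expected, got) for each line where the OCR differs."""
    mismatches = []
    pairs = zip_longest(expected, ocr_lines, fillvalue="")
    for number, (want, got) in enumerate(pairs, start=1):
        if want.strip() != got.strip():
            mismatches.append((number, want.strip(), got.strip()))
    return mismatches


def error_rate(expected: List[str], ocr_lines: List[str]) -> float:
    """Fraction of lines that differ; missing or extra lines count as errors."""
    total = max(len(expected), len(ocr_lines))
    return len(compare_ocr(expected, ocr_lines)) / total if total else 0.0


if __name__ == "__main__":
    hand_typed = ["1001", "1002", "1003", "1004"]
    ocr_output = ["1001", "l002", "1003", "10O4"]  # two typical misreads
    print(compare_ocr(hand_typed, ocr_output))
    print(error_rate(hand_typed, ocr_output))
```

Running the same hand-typed sample against each candidate tool gives a like-for-like error rate before committing to 60 or so screens.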

Ideally, you will need an OCR system that captures alphameric data in columnar form - assuming that the on-screen report has columns.
There may be others, but there is only one system that I have come across that:

(a) captures directly from the screen,
(b) then immediately OCRs the text with great accuracy.

It can be very accurate, with few (if any) errors, where the captured image is clear and the characters in the text are unambiguous. That is ABBYY FineReader.

So I would suggest that you consider using that as you can capture each screen and OCR it in one step, for all 60 or so screens.

ABBYY FineReader is a professional OCR solution allowing you to convert scanned and photographed documents as well as PDFs into editable formats.

The special thing about this software is its efficiency in use: in one step, it can capture tabulated on-screen text, OCR it and send it to the Clipboard in tabular-spaced format for use in any document, or directly to a Microsoft Excel spreadsheet (if Excel is already installed).

ABBYY FineReader has been reviewed and discussed elsewhere in the DC Forum. It is frequently available and offered for $FREE as a promotion for other ABBYY software. It usually comes as part of the ABBYY FineReader vX.0 Sprint suite of software bundled with some scanners.

Hope this helps or is of use.
