Review/Tips: "Scanning - VueScan and Associates" Pt.I: Intro & Bookscanning

Nod5:
Fantastic review brahman!  :Thmbsup:

Some remarks on book/document scanning:

My experience is that scanning to 400 dpi grayscale gives near perfect, OCR'able output after postprocessing. I've found no need for higher dpi for regular black and white text documents.

ScanTailor is excellent. A support forum is here: http://www.diybookscanner.org/forum/viewforum.php?f=8 . The amazing diybookscanner.org forum is an incredible general resource on the hardware and software process of book scanning, though most hardware focus is on cool camera based DIY scanner builds. A very helpful and friendly forum.

I have made TiffDjvuOcr, a Windows frontend app (for tesseract, djvulibre and imagemagick) that inputs ScanTailor processed .tiffs and outputs an OCRed djvu in one drag and drop step:
http://nod5.dcmembers.com/tiffdjvuocr.html

A newer, similar but more powerful Linux frontend is djvubind:
http://code.google.com/p/djvubind/
http://www.diybookscanner.org/forum/viewtopic.php?f=3&t=521

Curt:
This review is plain awesome. Wow!

Thanks for explaining so many things.  ;)

brahman:
Oh great, so much interest in my little review and so many questions!  :Thmbsup:

Would you know by chance the degree of support for 64bit Windows OSes?  I tried to get my scanner working with the VueScan trial on Win7/64bit without much luck (but there are no Visioneer 64 bit drivers either).  Are there some tricks available online anywhere?
-Lutz_ (August 13, 2010, 11:40 PM)
--- End quote ---

Hi Lutz,

Until version 8.6.42, VueScan supported only flatbed scanning on 64bit Windows, for all scanners. Since then Ed has added much more support for 64bit Windows, so that VueScan now supports transparency scanning on x64 for a lot of scanners. The limitations came from the restrictions of the WIA interface, but recently a lot of work has been done to transfer some functionality from TWAIN to the competing (but much less complete) Microsoft WIA API.

So if your attempt with VueScan was a while back, please try again.

Also please look up your scanner model here:
http://www.hamrick.com/vuescan/vuescan.htm#visioneer

A lot of the Visioneer scanners are supported directly via VueScan's driver without the need to install the Visioneer driver. These are the steps to follow:
 

* Make sure you've installed VueScan and that you've answered 'Yes' about installing drivers

* Click the Start button in the lower left corner

* Click 'Computer' or 'My Computer' with the right mouse button and choose 'Properties'

* Click the 'Hardware' tab (not needed on Vista or 7) and then click 'Device Manager'

* Click the scanner with the right mouse button and choose 'Properties'

* Click the 'Driver' tab

* Click 'Uninstall'

* Reboot the computer

* If Windows asks for a driver for the scanner, tell it to install it automatically or to look in c:\vuescan.

This should cause the driver for the scanner to be loaded properly.

Hope this helps!

brahman:
Hi Jim,

Very nice review! Thank you very much for this.
-J-Mac (August 14, 2010, 01:02 AM)
--- End quote ---
Thank you for reading it!

One question about VueScan: Can it auto-detect and scan/crop multiple photos on a scanner bed? My previous HP AIO's had this feature but it never worked well. The Canon PIXMA AIO's I am now using do this very well. I still have a few thousand old family photos to scan and can't possibly do it scanning one at a time!

--- End quote ---

In the last couple of months VueScan has improved its auto crop facility a lot. It will work if you select Crop>Multi crop with these options:



That said, you will get *much* better results if you can scan directly from film. Some of the PIXMA models, like the MP970, MP980, and MP990, support film very well with great results. Check here. You can usually load 6 or 8 images at a time in the supplied film holder. However, these scanners - being CMOS AIOs - do not have sufficient depth of field to support framed slides, but film will be OK.

In part III of my review I will go into more detail on color scanning with a lot more tips on it. To quickly get all the photos digitized I would also suggest you use VueScan's (professional version only) raw scanning capability so you only have to worry about putting the photos in your scanner. You can then fine tune the raw scans later at your convenience or even batch develop the entire collection in one swoop. This will also be covered in part III.

Good luck with this big task!

brahman:
Hi Nod5,

Fantastic review brahman!  :Thmbsup:
-Nod5 (August 14, 2010, 05:29 AM)
--- End quote ---
Thanks!

My experience is that scanning to 400 dpi grayscale gives near perfect, OCR'able output after postprocessing. I've found no need for higher dpi for regular black and white text documents.

--- End quote ---
Yes, 400dpi gray is a good way to go. However, I recommend scanning at 300dpi gray, then upscaling to 600dpi *b/w* and running OCR on that (as explained in the review). 300dpi will give you much faster scanning, since most modern scanners work natively at 300dpi and 600dpi, but not at 400dpi. If you ask for 400dpi output, they scan at 600dpi and then have to downscale internally to 400dpi (there will be an important section in the upcoming parts on this entire point).
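For readers who want to see what "300dpi gray, then upscale to 600dpi b/w" means at the pixel level, here is a minimal sketch in plain Python. It is not VueScan's procedure or the exact method from the review: the function names, the bilinear interpolation, and the fixed threshold of 128 are all assumptions made for illustration. The page is modelled as a list of rows of gray values from 0 (black) to 255 (white).

```python
# Illustrative sketch only; names, interpolation method and threshold are assumptions.

def upscale_2x_bilinear(gray):
    """Double width and height of a grayscale image using bilinear interpolation."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(2 * h):
        # Map the centre of the output pixel back into source coordinates,
        # clamped so that border pixels reuse the nearest source pixel.
        sy = min(max((y + 0.5) / 2 - 0.5, 0.0), h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(2 * w):
            sx = min(max((x + 0.5) / 2 - 0.5, 0.0), w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = gray[y0][x0] * (1 - fx) + gray[y0][x1] * fx
            bottom = gray[y1][x0] * (1 - fx) + gray[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def binarize(gray, threshold=128):
    """Return 1 for ink (darker than threshold) and 0 for paper."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def gray300_to_bw600(gray, threshold=128):
    """Upscale the gray scan first, then reduce it to black and white."""
    return binarize(upscale_2x_bilinear(gray), threshold)


# Tiny 2x2 "page": the result is a 4x4 black-and-white image.
page = [[0, 255], [255, 0]]
for row in gray300_to_bw600(page):
    print(row)
```

The order matters in this sketch: interpolating while the image is still gray lets the threshold place the edges of the letters between the original pixels, which is lost if you go to b/w first and enlarge afterwards.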

In my review above, I did not stress two important points enough:
1. OCR results are also dependent on the size of the font to be scanned. Pocket books with small fonts need higher resolution than hard covers with big fonts.
2. Tesseract is OK for keyword searches (I noticed you are still using v2.04 for your frontend - the v3.0 prerelease gives much better results, so you may want to look at it and consider supporting it), but I would NOT recommend it for *serious* OCR work. For that you need a specialized commercial OCR package like ABBYY FineReader, which is better than Tesseract by leagues.
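To put a number on point 1, here is a small arithmetic sketch. The only fixed fact it uses is that a typographic point is 1/72 inch; the 8pt and 12pt sizes are example values chosen for illustration, not measurements from any particular book, and the function names are made up for this post.

```python
# Rough arithmetic sketch; the font sizes used below are illustrative assumptions.

POINTS_PER_INCH = 72  # one typographic point is 1/72 inch


def font_height_px(point_size, dpi):
    """Nominal height of the type in pixels when scanned at the given dpi."""
    return point_size * dpi / POINTS_PER_INCH


def dpi_for_same_detail(small_pt, large_pt, large_dpi):
    """dpi at which the small type gets as many pixels as the large type gets at large_dpi."""
    return large_dpi * large_pt / small_pt


print(font_height_px(12, 300))           # 12pt type at 300dpi
print(font_height_px(8, 300))            # 8pt type at the same 300dpi
print(dpi_for_same_detail(8, 12, 300))   # dpi needed for 8pt to match 12pt at 300dpi
```

So the smaller the type, the fewer pixels each letter gets at a given dpi, which is why a pocket book can need a higher scan resolution than a hard cover to reach the same OCR quality.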

Here I have some special advice: *Any* version of ABBYY FineReader *above* 7 (i.e. 8, 9, 10) is almost equally good. They only made some improvements in the interface and added some minor features, and even though they say accuracy has improved, really it hasn't - it is *equally good* in FineReader 8, 9, and 10. But in Europe you can legally buy full versions of 8 or 9 at rock bottom prices from some reputable software vendors. I have seen v8 for US$25 and v9 for US$35. These are full versions with CD, including shipping (but not to the US). You can buy them through a friend, and (s)he can simply give you the serial number and you are ready to go. I believe v9 started supporting some form of DjVu.

ScanTailor is excellent. A support forum is here: http://www.diybookscanner.org/forum/viewforum.php?f=8 . The amazing diybookscanner.org forum is an incredible general resource on the hardware and software process of book scanning, though most hardware focus is on cool camera based DIY scanner builds. A very helpful and friendly forum.

--- End quote ---
I agree, ST is good. But also don't forget ScanKromsator. It has more features. Did you know that the developer of ST is currently focussed on integrating DjVu output into the software? That would be great!  :Thmbsup:
BTW the last link in my review above brings the reader to the site you mentioned - I guess it's only obvious if one clicks on it. But thank you for linking directly to their ScanTailor thread.

I have made TiffDjvuOcr, a Windows frontend app (for tesseract, djvulibre and imagemagick) that inputs ScanTailor processed .tiffs and outputs an OCRed djvu in one drag and drop step:
http://nod5.dcmembers.com/tiffdjvuocr.html

--- End quote ---

Thank you for your excellent work on this front end. This is a great asset to the book scanning community.  :up:
