Topic: Looking for a "virtual scanner" software in one pass, outputting in PDF.  (Read 23273 times)

oversky

  • Participant
  • Joined in 2008
  • Posts: 19
Edited: the reason I want to use an image format rather than a text format is to avoid any possible error when converting the .doc file to .pdf, such as a missing font being substituted by another one.  So far, all the "regular" Word-to-PDF programs show a few errors here and there.  I really want a 100% perfect copy within the PDF, whatever font/text/image is used in the source document.

How do you know the Word-to-image process is perfect, since you are not capturing the screen now?
Do you have a test .doc file you could share?

MerleOne

  • Supporting Member
  • Joined in 2006
  • Posts: 957
  • 4D thinking
Fair enough - but in some editors (Acrobat?) you can embed the fonts you need to avoid this problem.

The problem with bitmap versions is that if you want good-quality printable versions of multipage documents, they will get very large.

The real issue is to be sure that the PDF will be an exact copy of the original Word document.  When there are many pages, all with a lot of mathematical formulas, manually spotting an error due to a font or some other cause in a text-structured PDF is virtually impossible.  Granted, the size of the created files could be an issue, but it appears more manageable than random errors that would slip through.
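To put rough numbers on the file-size concern, here is a back-of-the-envelope sketch in Python. It is only an illustration: the page size (US Letter), the 300 dpi resolution, and the function name `raster_page_bytes` are assumptions, not figures from this thread, and it ignores compression and row padding, so real PDFs will usually be smaller.

```python
def raster_page_bytes(width_in: float, height_in: float, dpi: int, bits_per_pixel: int) -> int:
    """Uncompressed size in bytes of one rasterised page (row padding ignored)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    # Assumed example: US Letter (8.5 x 11 in) scanned at 300 dpi.
    for label, bpp in [("1-bit black/white", 1), ("8-bit greyscale", 8), ("24-bit colour", 24)]:
        size = raster_page_bytes(8.5, 11, 300, bpp)
        print(f"{label}: {size / 1_000_000:.1f} MB per page")
```

Under those assumptions a page is about 1 MB in 1-bit black and white and about 25 MB in 24-bit colour before compression, which is why the bit depth and the compression setting matter far more for a formula-heavy document than the page count alone.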

BTW, I tried PDF Image Printer and it's exactly what I want.  Except for the price tag...
.merle1.

MerleOne

  • Supporting Member
  • Joined in 2006
  • Posts: 957
  • 4D thinking
Here's a little single purpose program in AutoIt that will:
...

Thanks a lot, will try it ASAP !
.merle1.

4wd

  • Supporting Member
  • Joined in 2006
  • Posts: 5,641
I should have put these in a comment or something; the nconvert options used are:

-quiet      [no CLI output - not strictly necessary since I hide the CLI window]
-multi      [multipage output]
-c 4        [ZIP compression]
-out pdf    [errr, PDF output]

I don't do any dpi/scaling/etc, so you may want to tweak the settings for enhanced output.  Just change the following line to add options:

   $nconcmd = 'nconvert.exe -quiet -multi -c 4 -out pdf -o ' & $newfile & ' ' & $file
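For anyone driving nconvert from something other than AutoIt, the same command can be sketched in Python as below. This is an illustration under assumptions, not the original program: the function names, the `extra_options` parameter, and the file names are made up here, and only the flags quoted above are used. Passing the command as a list avoids quoting problems when a path contains spaces.

```python
import subprocess


def build_nconvert_cmd(src, dst, extra_options=(), exe="nconvert.exe"):
    """Build the nconvert argument list using the options quoted in the post.

    extra_options is a hypothetical hook for any additional nconvert flags
    (resolution, scaling, ...); check nconvert's own help for what it accepts.
    """
    cmd = [exe, "-quiet", "-multi", "-c", "4", "-out", "pdf"]
    cmd.extend(extra_options)
    cmd.extend(["-o", str(dst), str(src)])
    return cmd


def convert(src, dst, extra_options=()):
    """Run nconvert; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_nconvert_cmd(src, dst, extra_options), check=True)
```

A call such as `convert("scan 01.tif", "scan 01.pdf")` would then run the same conversion as the AutoIt line, assuming nconvert.exe is on the PATH.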

dspelley

  • Charter Member
  • Joined in 2005
  • Posts: 90
  • But it's a dry heat.
Realized I misunderstood what was needed!  :P
We are at the very beginning of time for the human race. It is not unreasonable that we grapple with problems. But there are tens of thousands of years in the future. Our responsibility is to do what we can, learn what we can, improve the solutions, and pass them on.
--- Richard Feynman (1918-1988)

4wd

  • Supporting Member
  • Joined in 2006
  • Posts: 5,641
Resurrecting this topic:

@MerleOne: You may want to check out the deal that Curt posted: Debenu PDF Maximus FREE today

It should do what you want: monitor a folder and turn an image into a PDF. It can also turn a normal PDF into an image PDF.
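The "monitor a folder" idea can be sketched generically in Python. This says nothing about how Debenu PDF Maximus works internally; the function name `pending_images`, the suffix list, and the "same name with .pdf" rule are all assumptions made for the illustration.

```python
from pathlib import Path

# Assumed set of image types to pick up; adjust to taste.
IMAGE_SUFFIXES = {".tif", ".tiff", ".png", ".jpg", ".jpeg", ".bmp"}


def pending_images(folder):
    """Return images in `folder` that have no same-named .pdf next to them yet."""
    folder = Path(folder)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() in IMAGE_SUFFIXES
        and not p.with_suffix(".pdf").exists()
    )
```

A watcher would call this every few seconds and hand each returned path to whatever converter is in use, so files that already have a PDF are skipped on the next pass.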

MerleOne

  • Supporting Member
  • Joined in 2006
  • Posts: 957
  • 4D thinking
Thanks for notifying me.  I almost missed it!  I am not 100% sure it does what I want, because I want to convert Word documents with math formulas directly into a raster PDF; that is the only way to be sure the formulas are output correctly.  Here, if I have to go through a structured PDF first, I think I'll run into the formula translation issue again.  Anyway, the only way to know is to try it out...
.merle1.

Curt

  • Supporting Member
  • Joined in 2006
  • Posts: 7,566
- what PDF Image Printer, (not free and not cheap), does.
-perfect for the job MerleOne described, but as you said, certainly "not free".
I am almost in a coma: $130 without support ($175 with support) ? !!!!  :o

The price has been lowered to $89.95 without support and $121.44 with support (1 year only),
but that is still way too expensive, I think, for a mere screenshot saved as PDF. Or does it do more?



IainB

  • Supporting Member
  • Joined in 2008
  • Posts: 7,540
  • @Slartibartfarst
I'm not sure, but I suspect that, to some extent, the price may reflect the sophistication of the design and the technology in use.

For example, some years back, I recall that a colleague of mine had a .PDF file from a project management "cartel" trade association that he was a member of - the "Project Management Association", or something like that.
Anyway, he wanted to print out the .PDF document onto paper, but could not, because printing had been deliberately disabled in the security settings of the .PDF file. So he could only open it in a .PDF reader. Yet he wanted two copies - one for his desk at the office, and one for reference at home. I think the association took advantage (read "ripped off") members by charging them an arm and a leg to sell them a hardcopy of the document, and he wanted to avoid that.
The document was the then current year's members' handbook, containing all the arcane mumbo-jumbo methodology of the association that initiates had to learn. This was supposedly "proprietary" to the association, but was actually nothing more than the usual collection of project management theory and so-called "best" or good practice that is taught in business school and which has lain in the public domain since Taylor/Gantt.

I saw that he had access in his office to a program called OmniPage (I think it was that), so I suggested that the simplest thing might be to use OmniPage to open and read the document. I had read that OmniPage could blindly scan a document image once it had it in RAM in video/screen output format, OCR it, and output it to a .PDF or MS Word document file in a reasonable likeness of the original.
So he did that, and in one pass he easily output it to a Word document file, and printed off two hardcopies for himself and a third for me, for giving him the idea in the first place.
I looked through it and it seemed to be a very good likeness of the original .PDF file - images, diagrams, tables and all, and there were only a few minor OCR errors. You could always run the Word document file through a spellchecker to clear up the OCR errors, and fiddle with any image oddities.

I do not know whether any copyright was breached in the process, but what did strike me (as someone interested in all aspects of desktop publishing) was the sheer sophistication of the software that enabled you to do all this. OmniPage was pretty expensive to buy.

Curt

  • Supporting Member
  • Joined in 2006
  • Posts: 7,566
-I hope you're right, IainB, because if I press Ctrl+Alt+PrtScr, FastStone Capture will create an accurate screenshot of the entire page and let me simply click Save as PDF. That gives me a much better looking copy than any virtual PDF printer ever will, at a fraction of the price.

tomos

  • Charter Member
  • Joined in 2006
  • Posts: 11,959
I'm pretty sure this idea was also discussed in another thread (with Contro requesting ?)


I'd imagine something like OmniPage would have the ability to do a higher resolution image though, and other stuff as Iain says.

If you (one) are not looking for OCR, you could ask mouser whether the ability could be added to Screenshot Captor. Now that it has the fancy scrolling capture, the document could be captured at full page width in order to get a better quality image. (There's already a request to save scrolling image shots as multiple files.)
Tom

IainB

  • Supporting Member
  • Joined in 2008
  • Posts: 7,540
  • @Slartibartfarst
...if I press Ctrl+Alt+PrtScr, FastStone Capture will create an accurate screenshot of the entire page, and offer me to merely click Save as Pdf - and I have a much better looking copy than any virtual pdf printer ever will give me - at a fraction of the price.
If you only want a few pages, and you only want an image, then I think it might be best to stick with FastStone Capture or some other screenshot capture tool, rather than something like OmniPage. The book I was referring to was 300 or so pages long, and not what you'd really want to do page-by-page. OmniPage apparently just chuntered through the document in one pass.

I don't know that OmniPage did "...a higher resolution image...", because it did an OCR scan of the image in video/graphic RAM that had been read in from the .PDF file (quite clever really).
Going from digital-->digital-->OCR analogue though is likely to produce some errors. (The output was text and images in a Word document, don't forget.)

The suggestion of doing a scrolling screen capture of all the pages of a document in a .PDF reader would seem to have merit. I wondered about asking @mouser to add that to SSC (ScreenShot Captor) too, but didn't as it's not something I would want to do all that often. Nowadays, I could probably do something similar in OneNote otherwise.