DonationCoder.com Forum
Main Area and Open Discussion => General Software Discussion => Topic started by: MerleOne on February 21, 2012, 08:31 AM
-
Hi,
I am looking for virtual printer software such as doPDF, CutePDF, etc., except that the output of this virtual printer would be a PDF identical to one generated from a scanned copy of the original document (no text inside, just one picture per page).
Do you know of any software that does this in a single pass?
Thanks.
-
I'm not sure I understand what you are asking for.
-
Let's say I have a Word document. I want to create a PDF from this Word document that looks like a printout of the document, except that I don't want it to be "structured". I want the PDF to be equivalent to the one created by a scanner (hardware + software) after I print the .doc to paper and scan the printed paper to a PDF.
In such cases, the PDF doesn't contain text, only images.
-
OK, I understand now what you want.
Perhaps a simpler way to say it is:
"I want to be able to print from an application (MS Word for example) to PDF -- but with each page stored as an image, so that no text is stored or recoverable from the PDF file."
-
Exactly, thanks!
I currently do the following: I use the Office Document Image Writer virtual printer to convert a .doc into a .tif, then IrfanView to convert the .tif into a PDF, but there are 2 annoyances:
1/ 2 steps are required
2/ the output quality (tif to pdf) is quite poor (maybe a setting in an IrfanView plugin?)
Edited: the reason I want to use an image format rather than a text format is to avoid any possible error when converting the .doc file to the .pdf format, such as a missing font being substituted by another one, etc. So far, all "regular" Word-to-PDF software has shown a few errors here and there. I really want a 100% perfect copy within the PDF, whatever font/text/image is used in the source document.
-
I googled "print to jpg" just out of curiosity, and several things came up.
If you could print directly to a JPG or TIFF (with no loss of quality), maybe you could write a script to send the graphic output to a PDF maker, etc.
-
I googled "print to jpg" just out of curiosity, and several things came up.
If you could print directly to a JPG or TIFF (with no loss of quality), maybe you could write a script to send the graphic output to a PDF maker, etc.
-AndyM
Thanks, I'll look into it.
-
2/ the output quality (tif to pdf) is quite poor (maybe a setting in an IrfanView plugin?)
-MerleOne
Yes, re-check your settings. The quality can be surprisingly good if you set them right.
Edited:
This first GIF screenshot was made the long way, from a docx printed to TIFF and then converted via IrfanView to PDF:
[Attachment not available]
---------------------
Screenshot of the original docx:
[Attachment not available]
-
There are lots of programs (some pay & some free) that install a PDF printer driver. Just print from inside the app of your choice & the output goes to a PDF file rather than to a physical printer.
-
There are lots of programs (some pay & some free) that install a PDF printer driver. Just print from inside the app of your choice & the output goes to a PDF file rather than to a physical printer.-Innuendo
But most of them output to PDF in the format they were given, i.e. text document input -> text PDF output, image input -> image PDF output.
I think what MerleOne wants is no matter what the input, the PDF will always contain just images - what PDF Image Printer (http://www.peernet.com/pdf/), (not free and not cheap), does.
-
PriPrinter can do this (not free, though).
The Standard version does, with full bitmap output support (including multiple pages per file if the target file is TIFF).
The Pro version adds PDF support, but I could not see whether it supports complete bitmap formats inside the PDF file (i.e. no embedded fonts/images, only the bitmap equivalent).
-
Thanks all, will look into it !
-
I think what MerleOne wants is no matter what the input, the PDF will always contain just images - what PDF Image Printer (http://www.peernet.com/pdf/), (not free and not cheap), does.-4wd
-perfect for the job MerleOne described, but as you said, certainly "not free".
I am almost in a coma: $130 without support ($175 with support) ? !!!!
:o
-
Regarding PDF Image Printer, I am trying to install it within Sandboxie, but it has failed so far. I have to find another test machine... Will try PriPrinter too...
-
2/ the output quality (tif to pdf) is quite poor (maybe a setting in an IrfanView plugin?)
...
Yes, re-check your settings. The quality can be surprisingly good if you set them right.
...
-Curt
Thanks. Are you referring to the IrfanView plugin settings?
-
-yes. The first image in my first post (https://www.donationcoder.com/forum/index.php?topic=30037.msg279179#msg279179) was made exactly how you said:
I currently do the following: I use the Office Document Image Writer virtual printer to convert a .doc into a .tif, then IrfanView to convert the .tif into a PDF, -MerleOne
-
Here's a little single purpose program in AutoIt that will:
a) monitor a directory for .TIF or .TIFF files
b) call nconvert to output a PDF when it sees one
eg. test.tif -> test.pdf
Put it in a directory along with nconvert.exe, (available from here (http://www.xnview.com/en/nconvert.html)).
Call it from the CLI, (you can close the CLI afterwards), like so: ncfet2p <dir>
The archive contains the executable, source, UDF and example.
Basically just a butchered version of the example script in the archive.
You can exit it using the context menu on the tray icon, (standard AutoIt icon).
NOTE: I tested it on XP using the x86 version of nconvert.exe but I don't see a reason why it shouldn't work on Vista/7 x86/x64.
Feel free to do what you like with it, change it to use IrfanView, (which I don't use), or something else but remember: If it formats your drive, I wasn't anywhere near it :)
Should mention, it will overwrite any existing PDF of the same name and it doesn't delete the original TIF.
UPDATE: It now pops up a ToolTip for a few seconds when it starts a conversion (just to let you know it actually recognised a file).
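For anyone who would rather not run AutoIt, the monitoring half of the idea can be sketched in Python. This is only an illustration of the approach described above, not a port of the script in the archive: the function names (`pdf_name_for`, `find_new_tiffs`, `watch`), the polling interval and the `convert` callback are all hypothetical, and the actual conversion (nconvert, IrfanView or anything else) is deliberately left to that callback.

```python
import time
from pathlib import Path
from typing import Callable, List, Optional, Set

# Extensions treated as TIFF input (compared case-insensitively).
TIFF_SUFFIXES = {".tif", ".tiff"}


def pdf_name_for(tiff_path: Path) -> Path:
    """Map e.g. test.tif to test.pdf in the same directory."""
    return tiff_path.with_suffix(".pdf")


def find_new_tiffs(watch_dir: Path, seen: Set[Path]) -> List[Path]:
    """Return TIFF files in watch_dir not returned before, and remember them."""
    new_files = []
    for entry in sorted(watch_dir.iterdir()):
        is_tiff = entry.is_file() and entry.suffix.lower() in TIFF_SUFFIXES
        if is_tiff and entry not in seen:
            seen.add(entry)
            new_files.append(entry)
    return new_files


def watch(
    watch_dir: Path,
    convert: Callable[[Path, Path], None],
    poll_seconds: float = 2.0,
    max_polls: Optional[int] = None,
) -> None:
    """Poll watch_dir and call convert(tiff, pdf) once per new TIFF file.

    convert is a hypothetical callback supplied by the caller; with
    max_polls=None the loop runs until the process is interrupted.
    """
    seen: Set[Path] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for tiff in find_new_tiffs(watch_dir, seen):
            convert(tiff, pdf_name_for(tiff))
        polls += 1
        time.sleep(poll_seconds)
```

A simple polling loop like this can pick up a TIFF while the virtual printer is still writing it, so a real version would probably want to wait until the file size stops changing before converting.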
-
PriPrinter can do this (not free, though).
The Standard version does, with full bitmap output support (including multiple pages per file if the target file is TIFF).
The Pro version adds PDF support, but I could not see whether it supports complete bitmap formats inside the PDF file (i.e. no embedded fonts/images, only the bitmap equivalent).
-NigelH
I use priPrinter Pro and the PDF is structured - not a bitmap.
The only options I can see are:
[Attachment not available]
So you can't change the format of the output in the PDF (I also checked the application options; there are no PDF options at all in there).
-
Thanks for the info Carol.
I wonder if Pelikan Software has considered that this might be a useful feature for PriPrinter.
-
-yes. The first image in my first post (https://www.donationcoder.com/forum/index.php?topic=30037.msg279179#msg279179) was made exactly how you said:
-Curt
OK, thanks. My question was also: where do you change the settings?
-
I use priPrinter Pro and the PDF is structured - not a bitmap.
-Carol Haynes
Thanks for the info and for the link !
-
Can I ask why you want to do this? I find the structured nature of PDFs very useful and single image files a bit of a PITA. Is there a reason why the bitmap approach would be widely needed?
-
Is there a reason why the bitmap approach would be widely needed?
-Carol Haynes
Copy protection?, but I'm also quite curious about the OP's answer ;)
-
Can I ask why you want to do this? I find the structured nature of PDFs very useful and single image files a bit of a PITA. Is there a reason why the bitmap approach would be widely needed?
-Carol Haynes
Edited: the reason I want to use an image format rather than a text format is to avoid any possible error when converting the .doc file to the .pdf format, such as a missing font being substituted by another one, etc. So far, all "regular" Word-to-PDF software has shown a few errors here and there. I really want a 100% perfect copy within the PDF, whatever font/text/image is used in the source document.
-MerleOne
-
Fair enough - but in some editors (Acrobat?) you can embed the fonts you need to avoid this problem.
The problem with bitmap versions is that if you want good-quality printable versions of multipage documents, they will get very large.
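To put a rough number on "very large", here is a back-of-the-envelope sketch. The page size (US Letter), the 300 dpi resolution and the function names are assumptions chosen purely for illustration, and the figures are uncompressed sizes. Text pages usually compress well, so a real image PDF should come out considerably smaller than this upper bound.

```python
from typing import Tuple


def page_pixels(width_in: float, height_in: float, dpi: int) -> Tuple[int, int]:
    """Pixel dimensions of a page rendered at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(
    width_in: float, height_in: float, dpi: int, bits_per_pixel: int
) -> int:
    """Raw bitmap size of one page, before any compression is applied."""
    width_px, height_px = page_pixels(width_in, height_in, dpi)
    return (width_px * height_px * bits_per_pixel + 7) // 8


def document_bytes(
    pages: int, width_in: float, height_in: float, dpi: int, bits_per_pixel: int
) -> int:
    """Raw bitmap size of a whole document with identical page sizes."""
    return pages * uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel)


if __name__ == "__main__":
    # Assumed example: US Letter (8.5 x 11 in) at 300 dpi.
    for label, bpp in (("24-bit colour", 24), ("8-bit grey", 8), ("1-bit b/w", 1)):
        size = uncompressed_bytes(8.5, 11, 300, bpp)
        print(f"{label}: {size / 1_000_000:.1f} MB per page")
```

The size scales with the square of the resolution, so doubling the dpi for sharper formulas roughly quadruples the raw size, while dropping from colour to 1-bit black and white cuts it by a factor of 24.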
-
Edited: the reason I want to use an image format rather than a text format is to avoid any possible error when converting the .doc file to the .pdf format, such as a missing font being substituted by another one, etc. So far, all "regular" Word-to-PDF software has shown a few errors here and there. I really want a 100% perfect copy within the PDF, whatever font/text/image is used in the source document.
-MerleOne
-tomos
How do you know the Word-to-image process is perfect, given that you are not capturing the screen?
Do you have a test .doc file to share?
-
Fair enough - but in some editors (Acrobat?) you can embed the fonts you need to avoid this problem.
The problem with bitmap versions is that if you want good-quality printable versions of multipage documents, they will get very large.
-Carol Haynes
The real issue is being sure that the PDF will be an exact copy of the original Word document. When there are many pages, all with a lot of mathematical formulas, manually spotting an error due to a font (or any other cause) in a text-structured PDF is virtually impossible. Granted, the size of the created files could be an issue, but that seems more manageable than random errors that slip through.
BTW, I tried PDF Image Printer and it's exactly what I want. Except for the price tag...
-
Here's a little single purpose program in AutoIt that will:
...
-4wd
Thanks a lot, will try it ASAP !
-
Thanks a lot, will try it ASAP !-MerleOne
Should have put them in a comment or something, the nconvert command/options used are:
-quiet [no CLI output - not strictly necessary since I hide the CLI window]
-multi [multipage output]
-c 4 [ZIP compression]
-out pdf [errr, PDF output]
I don't do any dpi/scaling/etc, so you may want to tweak the settings for enhanced output. Just change the following line to add options:
$nconcmd = 'nconvert.exe -quiet -multi -c 4 -out pdf -o ' & $newfile & ' ' & $file
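If you end up calling nconvert from something other than AutoIt, the same command can be assembled as an argument list, which avoids having to quote file names that contain spaces. The sketch below is Python and only reuses the options listed above (-quiet, -multi, -c 4, -out pdf, -o). The helper names and the `extra_options` parameter are hypothetical, and it assumes nconvert.exe can be found on the PATH or in the current directory.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def build_nconvert_args(
    tiff_path: Path, pdf_path: Path, extra_options: Sequence[str] = ()
) -> List[str]:
    """Build the nconvert command line as a list of separate arguments.

    extra_options is a hypothetical hook for any dpi/scaling options
    you decide to add; it is empty by default.
    """
    return [
        "nconvert.exe",
        "-quiet",        # no CLI output
        "-multi",        # multipage output
        "-c", "4",       # ZIP compression, as described in the post above
        "-out", "pdf",   # PDF output
        *extra_options,
        "-o", str(pdf_path),
        str(tiff_path),
    ]


def convert(
    tiff_path: Path, pdf_path: Path, extra_options: Sequence[str] = ()
) -> int:
    """Run nconvert and return its exit code (assumes nconvert.exe is available)."""
    args = build_nconvert_args(tiff_path, pdf_path, extra_options)
    return subprocess.run(args, check=False).returncode
```

Because each path is passed as its own list element, a file such as "my scan.tif" reaches nconvert as a single argument without any extra quoting.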
-
Realized I misunderstood what was needed! :P
-
Resurrecting this topic:
@MerleOne: You may want to check out the deal that Curt posted: Debenu PDF Maximus FREE today (https://www.donationcoder.com/forum/index.php?topic=31544.0)
It should do what you want: monitor a folder and turn an image into a PDF. It can also turn a normal PDF into an image PDF.
-
Resurrecting this topic:
@MerleOne: You may want to check out the deal that Curt posted: Debenu PDF Maximus FREE today (https://www.donationcoder.com/forum/index.php?topic=31544.0)
It should do what you want: monitor a folder and turn an image into a PDF. It can also turn a normal PDF into an image PDF.
-4wd
Thanks for notifying me. I almost missed it! I am not 100% sure it does what I want, because I want to convert Word documents with math formulas directly into a raster PDF; that is the only way to be sure the formulas are output correctly. Here, if I have to go through a structured PDF first, I think I'll run into the formula translation issue again. Anyway, the only way to know is to try it out...
-
- what PDF Image Printer (http://www.peernet.com/pdf/), (not free and not cheap), does.-4wd
-perfect for the job MerleOne described, but as you said, certainly "not free".
I am almost in a coma: $130 without support ($175 with support) ? !!!! :o -Curt
The price has been lowered to $89.95 without- and $121.44 with- support (1-year only)
-but still way too expensive, I think, for a mere screenshot saved as PDF. Or is there more to it?
-
I'm not sure, but I suspect that, to some extent, the price may reflect the sophistication of the design and the technology in use.
For example, some years back, I recall that a colleague of mine had a .PDF file from a project management "cartel" trade association that he was a member of - the "Project Management Association", or something like that.
Anyway, he wanted to print out the .PDF document onto paper, but could not, because printing had been deliberately disabled in the security settings of the .PDF file. So he could only open it in a .PDF reader. Yet he wanted two copies - one for his desk at the office, and one for reference at home. I think the association took advantage (read "ripped off") members by charging them an arm and a leg to sell them a hardcopy of the document, and he wanted to avoid that.
The document was the then current year's members' handbook, containing all the arcane mumbo-jumbo methodology of the association that initiates had to learn. This was supposedly "proprietary" to the association, but was actually nothing more than the usual collection of project management theory and so-called "best" or good practice that is taught in business school and which has lain in the public domain since Taylor/Gantt.
I saw that he had access in his office to software called OmniPage (I think it was that), so I suggested to him that the simplest thing might be to use OmniPage to open and read the document, because I had read that OmniPage could blindly scan a document image once it had it in RAM in video/screen output format, OCR it, and output it to a .PDF or MS Word document file in a reasonable likeness of the original.
So he did that, and in one pass he easily outputted it to a Word document file, and printed off two hardcopies for himself and a third for me - for giving him the idea in the first place.
I looked through it and it seemed to be a very good likeness of the original .PDF file - images, diagrams, tables and all, and there were only a few minor OCR errors. You could always parse the Word document file with a spellchecker and clear up the OCR errors, and fiddle with any image oddities.
I do not know whether any copyright was breached in the process, but what did strike me (as someone interested in all aspects of desktop publishing) was the sheer sophistication of the software that enabled you to do all this. OmniPage was pretty expensive to buy.
-
I'm not sure, but I suspect that, to some extent, the price may reflect the sophistication of the design and the technology in use.-IainB
-I hope you're right, IainB, because if I press Ctrl+Alt+PrtScr, FastStone Capture will create an accurate screenshot of the entire page, and offer me to merely click Save as Pdf - and I have a much better looking copy than any virtual pdf printer ever will give me - at a fraction of the price.
-
I'm pretty sure this idea was also discussed in another thread (with Contro requesting ?)
I'm not sure, but I suspect that, to some extent, the price may reflect the sophistication of the design and the technology in use.-IainB
-I hope you're right, IainB, because if I press Ctrl+Alt+PrtScr, FastStone Capture will create an accurate screenshot of the entire page, and offer me to merely click Save as Pdf - and I have a much better looking copy than any virtual pdf printer ever will give me - at a fraction of the price.-Curt
I'd imagine something like OmniPage would have the ability to do a higher-resolution image though, and other stuff, as Iain says.
If you're not looking for OCR, you (one) could ask mouser whether the ability could be added to Screenshot Captor. Now that it has fancy scrolling capture, a page could be captured at full page width in order to get a better-quality image. (There's already a request to save scrolling-image shots as multiple files.)
-
...if I press Ctrl+Alt+PrtScr, FastStone Capture will create an accurate screenshot of the entire page, and offer me to merely click Save as Pdf - and I have a much better looking copy than any virtual pdf printer ever will give me - at a fraction of the price.
-Curt
If you only want a few pages, and you only want an image, then I think it might be best to stick with FastStone Capture or some other screenshot capture tool, rather than something like OmniPage. The book I was referring to was 300 or so pages long, and not what you'd really want to do page-by-page. OmniPage apparently just chuntered through the document in one pass.
I don't know that OmniPage did:
"...a higher resolution image..."
- because it did an OCR scan of the image in video/graphic RAM that had been read in from the .PDF file (quite clever really).
Going from digital-->digital-->OCR analogue though is likely to produce some errors. (The output was text and images in a Word document, don't forget.)
The suggestion of doing a scrolling screen capture of all the pages of a document in a .PDF reader would seem to have merit. I wondered about asking @mouser to add that to SSC (ScreenShot Captor) too, but didn't as it's not something I would want to do all that often. Nowadays, I could probably do something similar in OneNote otherwise.