DonationCoder.com Forum

Main Area and Open Discussion => General Software Discussion => Topic started by: kalos on September 12, 2010, 12:58 PM

Title: extracting info from pdf
Post by: kalos on September 12, 2010, 12:58 PM
hello

I am starting to work with PDF files and I would like to extract a table and save it as a graphics file or as a MS Office table (not as Excel, because it has symbols, lines, etc, it is not only numerical data)

what is the best way to do this? I will incorporate it in a PowerPoint or MS Office document

I need it to be in the original quality

I also need not to use a crop tool, because I need the optimum margins, etc, is there a way to extract that table in the default way?

thanks
Title: Re: extracting info from pdf
Post by: ha14 on September 12, 2010, 01:50 PM
Hi

Able2Extract
http://www.investintech.com/able2extract.html
Title: Re: extracting info from pdf
Post by: cmpm on September 13, 2010, 07:25 AM
Nitro's new free reader might do it.

http://www.nitroreader.com/

Claiming the ability to extract certain things.
Title: Re: extracting info from pdf
Post by: Curt on September 13, 2010, 12:42 PM
Thanks for telling about this new reader, cmpm. Nitro's free reader is surprisingly good - especially at the price! It will however not load quite as fast as the advert makes you imagine, and as far as I know, it can "merely" extract text and images, not 'tables'.
Title: Re: extracting info from pdf
Post by: ha14 on September 13, 2010, 01:44 PM
Try this FreeOCR
http://www.freeocr.net/

FreeOCR is free software that can recover text from an image of printed text, from a scanned sheet, and even from a PDF
this tutorial is in French but illustrated and easy to follow
http://www.pcastuces.com/pratique/bureautique/ocr/page5.htm
Title: Re: extracting info from pdf
Post by: daddydave on September 13, 2010, 02:52 PM
I need it to be in the original quality

IMO this requirement will push you to spend money (and even then, still no guarantee it will be in the original quality). PDF Converter 7  (http://www.nuance.com/for-business/by-product/pdf/index.htm) (the $49.99 version) appears to have the ability to "Convert PDF and XPS documents into all Microsoft Office formats in a click."  I have never used it, but I know some people swear by the Pro version of it.

Unfortunately, I am not seeing a demo version on the site  :huh:

And of course Adobe Acrobat, the full version has the ability to save to Office formats as well, but it costs around $273 minimum in the U.S.
Title: Re: extracting info from pdf
Post by: rjbull on September 13, 2010, 03:03 PM
PDF Converter 7  (http://www.nuance.com/for-business/by-product/pdf/index.htm)

This is one of the Nuance ones.  I have Able2Extract.  Darwin kindly did a test on his Nuance Pro (more expensive) on the complicated front page of a World patent, for me to compare.  Nuance did a slightly better job, but there wasn't a lot in it.  Able2Extract in fact uses technology licensed from Nuance.
Title: Re: extracting info from pdf
Post by: daddydave on September 13, 2010, 03:11 PM
PDF Converter 7  (http://www.nuance.com/for-business/by-product/pdf/index.htm)

This is one of the Nuance ones.  I have Able2Extract.  Darwin kindly did a test on his Nuance Pro (more expensive) on the complicated front page of a World patent, for me to compare.  Nuance did a slightly better job, but there wasn't a lot in it.  Able2Extract in fact uses technology licensed from Nuance.

Sweet. The test was with the free version of Able2Extract, I take it?
Title: Re: extracting info from pdf
Post by: cyberdiva on September 13, 2010, 03:45 PM
Sweet. The test was with the free version of Able2Extract, I take it?
I didn't see any free version of Able2Extract when I went to their website.  Or was your statement meant tongue-in-cheek?
Title: Re: extracting info from pdf
Post by: daddydave on September 13, 2010, 03:53 PM
Sweet. The test was with the free version of Able2Extract, I take it?
I didn't see any free version of Able2Extract when I went to their website.  Or was your statement meant tongue-in-cheek?

It wasn't meant as tongue-in-cheek but now it appears I was wrong. It looks like they have a $99.95 version and a $129.95 version, and you can also buy a 30 day demo for $34.95.  :lol: (Seriously!)
Title: Re: extracting info from pdf
Post by: Curt on September 13, 2010, 06:33 PM
I need it to be in the original quality

IMO this requirement will push you to spend money (and even then, still no guarantee it will be in the original quality). PDF Converter 7  (http://www.nuance.com/for-business/by-product/pdf/index.htm) (the $49.99 version) appears to have the ability to "Convert PDF and XPS documents into all Microsoft Office formats in a click."  I have never used it, but I know some people swear by the Pro version of it.

-I believe the $100 PRO version is needed if .XPS must be included.

Nuance, compare features, (each screenshot approx 960 pixels wide)
Edited: not a good width without a widescreen! >The original pdf document (http://www.nuance.com/ucmprod/groups/imaging/documents/collateral/nd_002765.pdf)< may suit your monitor better! [Pictures removed.]

Edit #2:
Actually, I imagine the $150 Enterprise version is needed, for the jobs kalos described -
Nuance is a converter, not an editor.
Title: Re: extracting info from pdf
Post by: daddydave on September 13, 2010, 07:45 PM
I believe the $100 PRO version is needed if .XPS must be included.

Then why does the web site say otherwise? I quoted directly from the description of the $49.99 product. At any rate, the original poster said nothing about XPS.
Title: Re: extracting info from pdf
Post by: kalos on September 14, 2010, 10:43 AM
is there a free PDF editor/converter that works well? many converters fail to convert properly to doc, even Acrobat
if not, any paid one that is really good?

thanks
Title: Re: extracting info from pdf
Post by: ha14 on September 14, 2010, 01:30 PM
try universal document converter http://www.print-driver.com/howto/
Title: Re: extracting info from pdf
Post by: rjbull on September 14, 2010, 02:52 PM
The test was with the free version of Able2Extract, I take it?

No, painfully paid for.  Oh, and the reason why I chose Able2Extract over Nuance at the time - pre-test, as it happened - was because Nuance directed me to their UK shop, where they translate $US into £UK one-to-one.  I might have missed a slightly better program, but they missed my business through greed and taking UK residents for fools.
Title: Re: extracting info from pdf
Post by: daddydave on September 14, 2010, 03:09 PM
Oh, and the reason why I chose Able2Extract over Nuance at the time - pre-test, as it happened - was because Nuance directed me to their UK shop, where they translate $US into £UK one-to-one.  I might have missed a slightly better program, but they missed my business through greed and taking UK residents for fools.

Ah...got it. Yes, I did wonder about that, thanks for clarifying.
Title: Re: extracting info from pdf
Post by: rjbull on September 14, 2010, 03:15 PM
is there a free PDF editor/converter that works well? many converters fail to convert properly to doc, even Acrobat
if not, any paid one that is really good?

Like others, I doubt you will find a free tool to do what you want, especially with tables.  I've used XPDF (http://www.foolabs.com/xpdf/) for extracting text from PDFs.  It will also extract images and (I think) do a few other things.  You might try the free online file conversion service Zamzar (http://www.zamzar.com/), which I found quite good when I tried it.  There's at least one more similar service, Media-Convert (http://media-convert.com/).
Title: Re: extracting info from pdf
Post by: daddydave on September 14, 2010, 03:24 PM
I should add I have experimented with freeware tools to convert PDFs to HTML (a potential intermediate format for Word), but the results were abysmal so I don't really recommend that.
Title: Re: extracting info from pdf
Post by: Perry Mowbray on September 14, 2010, 10:05 PM
I have, at times, used OCR Terminal home page: Online OCR (https://www.ocrterminal.com/) extensively, and like most (I think probably all) other software the results can be a little flaky and definitely need editing before using.

One job I was using it for was converting data printouts into Excel spreadsheets (which had to go via Word), and apart from mixing some of the combined cells up, it did pretty well. Still needed an edit as the original document was not perfect quality and in a small font size... but it saved quite a bit of time in the end.
Title: Re: extracting info from pdf
Post by: StuR on September 17, 2010, 08:26 AM
kalos,

I've had very good luck a number of times with http://www.pdftoword.com/default.aspx . Upload your pdf to their site, they convert it and return by email. Occasionally within a few minutes, always within 24 hours. And free!

Just two days ago at work I had them convert a 9-pg pdf with tables, highlighting, and a screenshot: the .doc version they returned was perfect.
Title: Re: extracting info from pdf
Post by: Curt on September 17, 2010, 10:27 AM
-your first post after FIVE years' membership?!
Wow, you're not a man of too many words. Respect!  :Thmbsup:


... http://www.pdftoword.com/default.aspx ...
... the .doc version they returned was perfect.

-as it also would be if you use their desktop program: Nitro PDF Professional :up:
Title: Re: extracting info from pdf
Post by: kalos on September 17, 2010, 10:33 AM
-your first post after FIVE years' membership?!
Wow, you're not a man of too many words. Respect!  :Thmbsup:


that's an honour to post in my thread
Title: Re: extracting info from pdf
Post by: StuR on September 17, 2010, 10:38 AM
Curt,

Yeah, I've been mostly a lurker here. I imagine there are a bunch of DC people like me: active and interested computer user both at home and at work, but not a coder, not a serious computer hobbyist, not too much interested in mucking about with hardware or modifying software or the like.

But I follow DC enthusiastically, and it's given me FARR and Screenshot Captor (which I use and talk up regularly) and a bunch of other interesting things.
Mouser is a god.

So I've held back not through reticence, but because most active posters have levels of knowledge and skill well above mine. When I feel I have something useful to add, I will.

(and now two posts in one day. It's a Trend!)

Title: Re: extracting info from pdf
Post by: steveorg on September 17, 2010, 04:19 PM
...extract a table and save it as a graphics file...

...I will incorporate it in a PowerPoint or MS Office document...

...I need it to be in the original quality...

...not to use a crop tool, because I need the optimum margins, etc
-kalos

I'm focusing on the graphics approach that you mentioned because that just seems like the path of least resistance. Use Screenshot Captor to grab the image and your favorite graphics editor to adjust the margins to your liking.
Title: Re: extracting info from pdf
Post by: kalos on September 20, 2010, 12:57 PM
disadvantages of Screenshot Captor (and any other screenshot tool):

1) it does not capture the graphics in the original/default resolution (if you zoom the pdf in or out, it will capture the graphics in a different resolution) and this may result in graphics without optimum resolution (too zoomed out may distort, too zoomed in may become unusable if you want it bigger)

2) it does not capture the graphics with the original/default borders (since you drag the rectangle on your own) and this may result in a badly proportioned image

look this pdf editor how nice it recognizes and selects the image (see the blue borders):
[ You are not allowed to view attachments ]
but it cannot copy and paste in an MS Paint file! it cannot extract it!
Title: Re: extracting info from pdf
Post by: tomos on September 20, 2010, 04:16 PM
This may depend on the file - but with adobe reader I just selected an image in a pdf (via drag + select the area around it, then "copy image" in the context menu). I was able to paste the image in Evernote and MS Paint.

I dont think even the pdf reader will give you the option to show images at their original resolution - I suspect there could even be images with different resolutions within the one file. So I think you cant expect Screenshot captor (or other) to do that (I mean considering it's the pdf reader has the file open/displayed)
Title: Re: extracting info from pdf
Post by: kalos on September 20, 2010, 04:46 PM
This may depend on the file - but with adobe reader I just selected an image in a pdf (via drag + select the area around it, then "copy image" in the context menu). I was able to paste the image in Evernote and MS Paint.

this is screenshot capture tool, with the above mentioned disadvantages
Title: Re: extracting info from pdf
Post by: steveorg on September 20, 2010, 04:48 PM
I dont think even the pdf reader will give you the option to show images at their original resolution - I suspect there could even be images with different resolutions within the one file. So I think you cant expect Screenshot captor (or other) to do that (I mean considering it's the pdf reader has the file open/displayed)

I was going to make a similar point, but wanted to test it first. I'm far from an expert, but it has been my understanding (partly from experience) that a pdf rarely has enough data to extract components that are as detailed as the original source. On the contrary, the more efficient the pdf creation program, the smaller the file size. The pdf program should provide the least amount of data that is required to create the desired appearance.

For a bitmapped graphic, what you see is probably the best you'll get. I guess in theory, vector graphics are more flexible, but you may need appropriate software. Fonts may also scale under the right circumstances.

This is the document version of "You can't go home again." :P

Title: Re: extracting info from pdf
Post by: tomos on September 20, 2010, 05:32 PM
This may depend on the file - but with adobe reader I just selected an image in a pdf (via drag + select the area around it, then "copy image" in the context menu). I was able to paste the image in Evernote and MS Paint.

this is screenshot capture tool, with the above mentioned disadvantages

ah okay, (you said 'pdf editor' in your post, above the screenshot)

My point stands though: if you want to get the best quality image, copy it out of the pdf reader. You cannot expect the screenshot app to manipulate the pdf reader to give the best possible display - especially if the pdf reader itself cannot even do this.

on the other hand, as steveorg says, most pdf creators are focused on making the file smaller so the original image quality in the pdf might not be so good anyways...
Title: Re: extracting info from pdf
Post by: kalos on September 21, 2010, 10:09 AM
My point stands though: if you want to get the best quality image, copy it out of the pdf reader. You cannot expect the screenshot app to manipulate the pdf reader to give the best possible display - especially if the pdf reader itself cannot even do this.
let's say ok
now let's be a bit practical
what is the best time to copy in order to have the graphics in best quality? when pdf file is zoomed at 150% ? at 200 % ? at 300 % ?
at 400% the graphics start to pixelate
at 70% the graphics are too small
...

on the other hand, as steveorg says, most pdf creators are focused on making the file smaller so the original image quality in the pdf might not be so good anyways...

this doesn't matter, I just want the best image quality in the pdf file, not the image quality of the initial graphics file
Title: Re: extracting info from pdf
Post by: cmpm on September 21, 2010, 10:52 AM
If you could give an example pdf to do this operation, perhaps we all could experiment with the various tools each of us has.
Title: Re: extracting info from pdf
Post by: Curt on September 21, 2010, 11:06 AM
now let's be a bit practical
what is the best time to copy in order to have the graphics in best quality? when pdf file is zoomed at 150% ? at 200 % ? at 300 % ?

-neither.

1) Extract the pictures, if they actually are pictures and not just PDF generated background.
2) If they aren't genuine pictures, I would make a simple screenshot at 100%.


Title: Re: extracting info from pdf
Post by: TomD101 on September 22, 2010, 04:49 AM
Hello all,

When I was in need of a PDF->Office converter, I found after some testing the program SolidConverterPDF from www.soliddocuments.com.

I have no idea, how they do it, but the results are simply stunning. Of course, not everything is possible, but I converted manuals for devices like TV sets, DVD-recording machines, scientific books and whatnot. Just incredible.

The trial lets you convert 10 percent of the original document, max. 10 pages and adds a watermark.

I think, this is ok for testing. Prices start with $ 80 for a single user license.
The support is great, very personal and really able to solve problems.

Give it a try and no, I am not connected to them.

Thomas
Berlin, Germany
Title: Re: extracting info from pdf
Post by: Curt on September 22, 2010, 07:29 AM
-hello Thomas, and welcome back (again again!) :-)


Solid Doc's "WYSIWYG Content Extraction" really is quite impressive, (it sure made me consider a license for yet another program to be used once or twice a year... haha), it may actually be what was asked for by the starter of the thread. But another problem is that the same person seems to want everything for nothing, so I expect even the very mentioning of the price, was a turn-off!
Title: Re: extracting info from pdf
Post by: kalos on September 22, 2010, 12:18 PM
I don't have a problem with price, if a program can do what I want

zooming an A4 PDF at 100% makes the PDF not to fill the whole 15" screen
and then taking a screenshot at that zoom, results in a small low resolution photo, it doesnt maximize the info that the graphics file can contain
Title: Re: extracting info from pdf
Post by: tomos on September 22, 2010, 02:03 PM
I don't have a problem with price, if a program can do what I want

zooming an A4 PDF at 100% makes the PDF not to fill the whole 15" screen
and then taking a screenshot at that zoom, results in a small low resolution photo, it doesnt maximize the info that the graphics file can contain

I dont think you can say what is the best zoom. I personally make PDFs with 300-400 dpi images and you will get many PDFs with 72 dpi images.
I'd go as large as possible before the screenshot - unless you think the images looks better smaller which would probably rarely happen.

If you've no problem with price I'd try TomD101's suggestion cause I think you can (hopefully) do a lot better than going the screenshot route - especially if you will be doing this often. Then if something occasionally doesnt work you could grab a screenshot and insert it into the converted file.
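
To put rough numbers on the "best zoom" question for embedded pixel images: a minimal sketch, assuming the viewer draws one inch of page as 96 screen pixels at 100% zoom (a common Windows default, but your viewer and display may differ). The function name is made up for illustration. It computes the zoom at which one image pixel lands on exactly one screen pixel; zooming further than that only interpolates.

```python
def native_zoom_percent(image_dpi: float, screen_dpi: float = 96.0) -> float:
    """Zoom (in percent) at which one image pixel covers one screen pixel.

    Assumes the viewer maps one inch of page to `screen_dpi` pixels at 100%.
    """
    return 100.0 * image_dpi / screen_dpi


if __name__ == "__main__":
    # Typical embedded-image resolutions mentioned in the thread.
    for dpi in (72, 300, 400):
        print(f"{dpi} dpi image -> {native_zoom_percent(dpi):.1f}% zoom")
```

Since one PDF can hold images at several resolutions, there is no single zoom that is right for the whole file, which is why extracting the image is usually the safer route.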
Title: Re: extracting info from pdf
Post by: Curt on September 22, 2010, 02:54 PM
I don't have a problem with price, if a program can do what I want

-that is good.
I am sure more people than me are looking forward to reading what you think of Solid Doc (http://www.soliddocuments.com/).


Good luck on your way to your post number 300  :)
Can DonationCoder's forum do what you want?
Title: Re: extracting info from pdf
Post by: kalos on October 03, 2010, 05:49 AM
ok I test Solid PDF Tools (it is the most complete software from that company)
where is actually the WYSIWYG extractor???
so far I see just what other PDF editors have
Title: Re: extracting info from pdf
Post by: kalos on October 03, 2010, 12:30 PM
anyone???
I am in a hurry!!
Title: Re: extracting info from pdf
Post by: kalos on October 05, 2010, 09:19 AM
??????
Title: Re: extracting info from pdf
Post by: Curt on October 05, 2010, 10:35 AM
If you could give an example pdf to do this operation, perhaps we all could experiment with the various tools each of us has.
-plus of course a much more precise description of what the job is.
Title: Re: extracting info from pdf
Post by: kalos on October 07, 2010, 12:11 PM
but I already mentioned this

it is about extracting a photo, diagram, index, etc from a pdf file, but not by taking a screenshot that is not precise (since it varies with zoom value)

you told me that Solid PDF Tools offers this, to automatically recognize/select a table, graphics etc (all pdf editors do this) and to extract/save it as an image file (no pdf editor does this, they only do it if you take a screenshot)

there is no way to work properly with pdf files, i wonder why they created such format, it is very frustrating

TO SUM UP:
i just need to be able to extract graphics, but to do so properly, which means:

1) in the optimum resolution (which means best possible quality, without distortion resulting from too big zoom, or without loss of quality resulting from too small zoom)
2) with the optimum borders (which means optimally proportioned and not missing any area of the graphics, even if that area is empty)

also, i would like to be able to extract tables, diagrams etc in a format that i can easily replace their text, without damaging the format, architecture, etc of the graph, diagram, table, etc, but i bet this is too much for pdf format
Title: Re: extracting info from pdf
Post by: rjbull on October 07, 2010, 03:24 PM
Here is part of the manual for pdfimages, part of the XPDF (http://www.foolabs.com/xpdf/about.html) suite:

------------------------------------------------------------------------------
pdfimages(1)                                                      pdfimages(1)



NAME
       pdfimages  -  Portable  Document  Format (PDF) image extractor (version
       3.02)

SYNOPSIS
       pdfimages [options] PDF-file image-root

DESCRIPTION
       Pdfimages saves images from a Portable Document Format  (PDF)  file  as
       Portable Pixmap (PPM), Portable Bitmap (PBM), or JPEG files.

       Pdfimages  reads  the  PDF file, scans one or more pages, PDF-file, and
       writes one PPM, PBM, or JPEG file for each  image,  image-root-nnn.xxx,
       where  nnn  is  the image number and xxx is the image type (.ppm, .pbm,
       .jpg).

       NB: pdfimages extracts the raw image data from the  PDF  file,  without
       performing  any  additional  transforms.  Any rotation, clipping, color
       inversion, etc. done by the PDF content stream is ignored.
------------------------------------------------------------------------------
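
For anyone who wants to script this, here is a minimal sketch in Python that builds and runs the command from the synopsis above. It assumes pdfimages is installed and on your PATH; the function names and the file names in the usage note are made up for illustration. The -f, -l and -j options (first page, last page, keep JPEG data as .jpg) come from the same manual.

```python
import shutil
import subprocess


def build_pdfimages_cmd(pdf_path, image_root, first_page=None, last_page=None, keep_jpeg=True):
    """Build the argument list: pdfimages [options] PDF-file image-root."""
    cmd = ["pdfimages"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]  # first page to scan
    if last_page is not None:
        cmd += ["-l", str(last_page)]   # last page to scan
    if keep_jpeg:
        cmd.append("-j")                # write JPEG images as .jpg instead of .ppm
    cmd += [pdf_path, image_root]
    return cmd


def extract_images(pdf_path, image_root, **options):
    """Run pdfimages; raises if the tool is missing or exits with an error."""
    if shutil.which("pdfimages") is None:
        raise FileNotFoundError("pdfimages not found on PATH")
    subprocess.run(build_pdfimages_cmd(pdf_path, image_root, **options), check=True)
```

Calling extract_images("report.pdf", "img", first_page=3, last_page=3) would write files such as img-000.jpg for page 3. As the manual says, this only finds real embedded images; a table or diagram drawn from lines and text yields nothing.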
Title: Re: extracting info from pdf
Post by: Curt on October 07, 2010, 05:26 PM
Because of your request only, I have now tested Solid PDF Tools, and I must say that I cannot help thinking you may not yet have fully understood how to use the program. It can do all you asked for. If you still have the program installed, please watch the online tutorials, and read the manual. Remember that the program will not edit picture, Excel or Word files, it will only create them. (Look for a new folder!)

http://www.soliddocuments.com/info.htm?product=SolidPDFTools&id=233&frame=4&subject=CreatePDFtoExcel etcetera.

My Nitro PDF PRO OCR will also do what you ask for. My AnyBizSoft 5-in-1 PDF, as well.

Title: Re: extracting info from pdf
Post by: kalos on October 08, 2010, 11:04 AM
it was because of you that I tested Solid PDF Tools

wait, what procedure do you follow in Nitro PDF?

1)
Click EDIT, then click on the graphics you want to copy in the pdf file, then right click COPY, then paste in MS Paint?
if so, it doesn't work always, to be honest, it doesn't work with most graphics, maybe because the graphics are 'protected'

2)
Click "Snapshot" then drag to select an area then paste in MS Paint?
this way ALL the above mentioned problems occur (not optimum resolution, not optimum borders)

I am curious to read how you do this
Title: Re: extracting info from pdf
Post by: Curt on October 08, 2010, 12:50 PM
no "copy" or "save as" or ..., but "extract"!

Extract tables, extract images, extract this and that:

[ You are not allowed to view attachments ]


Title: Re: extracting info from pdf
Post by: kalos on October 08, 2010, 01:28 PM
oh, i have already tested this!!!

it does nothing from this pdf, it doesn't extract anything!!!

http://ifile.it/78znxpc
Title: Re: extracting info from pdf
Post by: Curt on October 08, 2010, 02:29 PM
your test file does not contain any tables or pictures at all, so there is nothing to extract, except text. It MAY have been Excel tables when it was created, not now, but it was more likely made in Word and Emax Draw, or similar. The big figure is made of many small parts; each column is a figure, each letter is a figure, etcetera. I am sorry for you, but there is no way you can extract all this as a unit.
Title: Re: extracting info from pdf
Post by: kalos on October 08, 2010, 03:24 PM
I know, that's why I need a screenshot-like tool that will take advantage of pdf editor's abilities to mark the appropriate borders and recognize the exact area to be copied

i am also in search of a way to estimate the optimum resolution before taking the snapshot
Title: Re: extracting info from pdf
Post by: tomos on October 08, 2010, 04:07 PM
i am also in search of a way to estimate the optimum resolution before taking the snapshot

The optimum resolution for a screenshot would simply be the screen resolution [I presume that's what screenshot tools choose(?)]. If you want to take a screenshot of a vector image/graph/etc (as in your sample pdf) you could enlarge the image as much as you can before taking the screenshot. That's your best quality there.

It's different with pixel images (jpg png gif etc) as I mentioned before (you're better off extracting them if possible).
Title: Re: extracting info from pdf
Post by: kalos on October 08, 2010, 04:25 PM
you could enlarge the image as much as you can before taking the screenshot. That's your best quality there.

how to calculate this? in which zoom can you extract the above scheme?
Title: Re: extracting info from pdf
Post by: tomos on October 08, 2010, 05:53 PM
you could enlarge the image as much as you can before taking the screenshot. That's your best quality there.

how to calculate this? in which zoom can you extract the above scheme?

sorry Kalos, I dont understand your question.

let me rephrase:
if the image is vector (NOT a pixel image) simply enlarge it as much as you can on your monitor, then take a screenshot.


[edit] you can then increase the dpi of the image to show it at a smaller size & higher quality [/edit]
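
The arithmetic behind this can be sketched as follows. This is a rough illustration, assuming PDF page sizes are in points (72 per inch) and that the viewer shows one inch of page as 96 screen pixels at 100% zoom; both function names are made up. The first function estimates how many pixels a region of the page occupies in a screenshot at a given zoom, the second how large that screenshot comes out when placed at a chosen dpi.

```python
POINTS_PER_INCH = 72.0


def screenshot_pixels(width_pt, height_pt, zoom_percent, screen_dpi=96.0):
    """Approximate pixel size of a page region captured at the given zoom."""
    scale = screen_dpi / POINTS_PER_INCH * zoom_percent / 100.0
    return round(width_pt * scale), round(height_pt * scale)


def printed_size_inches(width_px, height_px, target_dpi):
    """Physical size of a bitmap when it is placed at target_dpi."""
    return width_px / target_dpi, height_px / target_dpi


if __name__ == "__main__":
    # An A4 page is roughly 595 x 842 points.
    for zoom in (100, 200, 400):
        w, h = screenshot_pixels(595, 842, zoom)
        pw, ph = printed_size_inches(w, h, 300)
        print(f"{zoom}%: {w} x {h} px, or {pw:.2f} x {ph:.2f} in at 300 dpi")
```

So for vector content, more zoom simply means more pixels, and the limit is how much of the region fits on the monitor at once.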
Title: Re: extracting info from pdf
Post by: Curt on October 08, 2010, 07:46 PM
-the picture will not be vector, because kalos is talking about taking a screenshot; a bitmap. If the files are like the test file, I *imagine* size 150% will be fine. But it doesn't make sense to talk about borders; since there are no genuine pictures, there will be no natural borders.
Title: Re: extracting info from pdf
Post by: kalos on October 09, 2010, 02:20 AM
-the picture will not be vector, because kalos is talking about taking a screenshot; a bitmap. If the files are like the test file, I *imagine* size 150% will be fine. But it doesn't make sense to talk about borders; since there are no genuine pictures, there will be no natural borders.

indeed, one can "imagine" that 150% "would" be fine, these are the problems I need to overcome

as for natural borders, since there is no picture file of course they don't exist, but I think PDF editors do a better job of determining borders than an individual can do by dragging his mouse over the area to be captured
Title: Re: extracting info from pdf
Post by: tomos on October 09, 2010, 03:31 AM
if the image is vector (NOT a pixel image) simply enlarge it as much as you can on your monitor, then take a screenshot.

-the picture will not be vector, because kalos is talking about taking a screenshot; a bitmap.
[...]

Yes, of course the screenshot is going to be a pixel image.
But I'm talking about what the 'picture' is *in the PDF* - if it's a vector graph or image in the PDF, then do what I suggested above for best quality [screenshot] image.
Title: Re: extracting info from pdf
Post by: kalos on October 22, 2010, 10:55 AM
there is an amazing pdf editor that before taking the snapshot it asks you at which dpi you want to take it

but cant remember which editor is it

any hint?
Title: Re: extracting info from pdf
Post by: kalos on October 24, 2010, 05:46 AM
the problem is that when i need a high resolution snapshot and i zoom out much, the area that i want to capture goes out of the monitor, so i have to scroll

so there is a pdf editor that enables you to first specify the area you want to capture, then specify the resolution and it zooms outs itself and then takes the screenshot
Title: Re: extracting info from pdf
Post by: Curt on October 29, 2010, 10:09 AM
-zoom out, all you want, and use your screen capture application.
Title: Re: extracting info from pdf
Post by: tomos on October 29, 2010, 12:10 PM
-zoom out, all you want, and use your screen capture application.

or do it page width with a screen capture app that does a good scrolling capture
Title: Re: extracting info from pdf
Post by: kalos on October 29, 2010, 12:46 PM
"a screen capture app that does a good scrolling capture"
like??
Title: Re: extracting info from pdf
Post by: cranioscopical on October 29, 2010, 02:43 PM
"a screen capture app that does a good scrolling capture"
like??
Oh, come on now!
This one. (https://www.donationcoder.com/Software/Mouser/screenshotcaptor/index.html)
Title: Re: extracting info from pdf
Post by: Curt on October 29, 2010, 03:40 PM
-oh, such an uncertain route to go; he might end up becoming a supporting member!
Title: Re: extracting info from pdf
Post by: JavaJones on November 16, 2010, 08:26 PM
I'll admit I only skimmed this topic, so this may be wildly far off the mark, but what about opening in Photoshop and rasterizing at desired resolution, then cropping? Given that screenshot approaches are being talked about here, I take it having the tables, etc. in actual original format (i.e. being translated into a Word table) is not a requirement, thus the Photoshop approach should work well (if I've understood the requirement from reading first few posts and skimming the rest  :-[).

- Oshyan
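
The same rasterize-at-a-chosen-resolution-then-crop idea can also be sketched without Photoshop. The example below is an assumption on my part, not something tested in this thread: it builds a command line for pdftoppm as shipped with Poppler (other builds of pdftoppm may not support the crop options), and the function name and file names are made up. The region is given in points measured from the top-left corner of the page, and is converted to pixels at the requested dpi because the -x, -y, -W and -H options expect pixels.

```python
def build_pdftoppm_cmd(pdf_path, out_root, page, dpi, left_pt, top_pt, width_pt, height_pt):
    """Build a Poppler pdftoppm command that renders one cropped region to PNG."""

    def to_px(points):
        # 72 points per inch, so pixels = points * dpi / 72.
        return round(points * dpi / 72.0)

    return [
        "pdftoppm", "-png",
        "-r", str(dpi),                        # render resolution in dpi
        "-f", str(page), "-l", str(page),      # render this page only
        "-x", str(to_px(left_pt)),             # crop origin, in pixels from the left
        "-y", str(to_px(top_pt)),              # crop origin, in pixels from the top
        "-W", str(to_px(width_pt)),            # crop width in pixels
        "-H", str(to_px(height_pt)),           # crop height in pixels
        pdf_path, out_root,
    ]
```

For example, build_pdftoppm_cmd("report.pdf", "table", 2, 300, 72, 144, 360, 216) describes a 5 x 3 inch region on page 2 rendered at 300 dpi; pass the resulting list to subprocess.run to execute it. Because the crop is given as numbers instead of a mouse drag, the borders and resolution are repeatable.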
Title: Re: extracting info from pdf
Post by: TomD101 on March 29, 2011, 12:06 PM
Hi all,
let's see, if anyone is still looking at this thread. I stumbled upon this while checking on answered postings.
1st: Can't pdf be saved as html? I think I have done this. But it may have been with Acrobat Full version. It's not for the faint of purse. Advantage: text is html, images are saved separately.

2nd: One of the very good screenshot programs with a scrolling option is - apart from the incomparable screenshot captor - HyperSnap from hyperionics.com. Region with scrolling, windows with scrolling, text with scrolling, almost everything.
Give it a try, if you never heard about it.

Thomas
Berlin, Germany
Title: Re: extracting info from pdf
Post by: Curt on March 29, 2011, 06:13 PM
where did my post go? I am certain that I proofread it after uploading. Has DC been hit by something today?

--------
1st: Can't pdf be saved as html? ... Advantage: text is html, images are saved separately.

-good point, Thomas.

------------
I have a key for HyperSnap 6, but am "merely" using FastStone Capture, because it is so delightfully user-friendly. However, HyperSnap 7 so far seems to be becoming much more user-friendly than version 6 has been.


Version 7 beta tests started, and we currently have a poll for users to vote on new features to add to version 7 of HyperSnap. More information at HyperSnap Feedback and Beta Tests Forum (http://hyperionics.com/forum2/tt.aspx?forumid=9&p=1).