Author Topic: extracting info from pdf  (Read 15315 times)
kalos
Member
Posts: 978
« on: September 12, 2010, 12:58:27 PM »

hello

I am starting to work with PDF files and I would like to extract a table and save it as a graphics file or as an MS Office table (not as Excel, because it has symbols, lines, etc.; it is not only numerical data)

what is the best way to do this? I will incorporate it in a PowerPoint or MS Office document

I need it to be in the original quality

I would also rather not use a crop tool, because I need the optimum margins, etc. Is there a way to extract that table with its original/default borders?

thanks
« Last Edit: September 12, 2010, 01:01:01 PM by kalos »

ha14
Participant
Posts: 264
« Reply #1 on: September 12, 2010, 01:50:12 PM »

Hi

Able2Extract
http://www.investintech.com/able2extract.html

cmpm
Charter Member
Posts: 2,020
« Reply #2 on: September 13, 2010, 07:25:47 AM »

Nitro's new free reader might do it.

http://www.nitroreader.com/

It claims to be able to extract certain things.

Curt
Supporting Member
Posts: 6,262
« Reply #3 on: September 13, 2010, 12:42:03 PM »

Thanks for telling us about this new reader, cmpm. Nitro's free reader is surprisingly good - especially at the price! It will however not load quite as fast as the advert makes you imagine, and as far as I know, it can "merely" extract text and images, not 'tables'.

ha14
Participant
Posts: 264
« Reply #4 on: September 13, 2010, 01:44:26 PM »

Try this FreeOCR
http://www.freeocr.net/

FreeOCR is free software that can recover text from an image of printed text, from a scanned sheet, and even from a PDF.
This tutorial is in French, but it is illustrated and easy to follow:
http://www.pcastuces.com/...bureautique/ocr/page5.htm

daddydave
Supporting Member
Posts: 816
« Reply #5 on: September 13, 2010, 02:52:04 PM »

Quote from: kalos
I need it to be in the original quality

IMO this requirement will push you to spend money (and even then, still no guarantee it will be in the original quality). PDF Converter 7 (the $49.99 version) appears to have the ability to "Convert PDF and XPS documents into all Microsoft Office formats in a click."  I have never used it, but I know some people swear by the Pro version of it.

Unfortunately, I am not seeing a demo version on the site.

And of course the full version of Adobe Acrobat can save to Office formats as well, but it costs around $273 minimum in the U.S.
« Last Edit: September 13, 2010, 02:58:07 PM by daddydave »

rjbull
Charter Member
Posts: 2,701
« Reply #6 on: September 13, 2010, 03:03:02 PM »


This is one of the Nuance ones.  I have Able2Extract.  Darwin kindly did a test on his Nuance Pro (more expensive) on the complicated front page of a World patent, for me to compare.  Nuance did a slightly better job, but there wasn't a lot in it.  Able2Extract in fact uses technology licensed from Nuance.

daddydave
Supporting Member
Posts: 816
« Reply #7 on: September 13, 2010, 03:11:09 PM »


Quote from: rjbull
This is one of the Nuance ones.  I have Able2Extract.  Darwin kindly did a test on his Nuance Pro (more expensive) on the complicated front page of a World patent, for me to compare.  Nuance did a slightly better job, but there wasn't a lot in it.  Able2Extract in fact uses technology licensed from Nuance.

Sweet. The test was with the free version of Able2Extract, I take it?

cyberdiva
Supporting Member
Posts: 887
« Reply #8 on: September 13, 2010, 03:45:40 PM »

Quote from: daddydave
Sweet. The test was with the free version of Able2Extract, I take it?

I didn't see any free version of Able2Extract when I went to their website.  Or was your statement meant tongue-in-cheek?

daddydave
Supporting Member
Posts: 816
« Reply #9 on: September 13, 2010, 03:53:40 PM »

Quote from: daddydave
Sweet. The test was with the free version of Able2Extract, I take it?

Quote from: cyberdiva
I didn't see any free version of Able2Extract when I went to their website.  Or was your statement meant tongue-in-cheek?

It wasn't meant as tongue-in-cheek but now it appears I was wrong. It looks like they have a $99.95 version and a $129.95 version, and you can also buy a 30 day demo for $34.95.  lol (Seriously!)

Curt
Supporting Member
Posts: 6,262
« Reply #10 on: September 13, 2010, 06:33:02 PM »

Quote from: kalos
I need it to be in the original quality

Quote from: daddydave
IMO this requirement will push you to spend money (and even then, still no guarantee it will be in the original quality). PDF Converter 7 (the $49.99 version) appears to have the ability to "Convert PDF and XPS documents into all Microsoft Office formats in a click."  I have never used it, but I know some people swear by the Pro version of it.

-I believe the $100 PRO version is needed if .XPS must be included.

Nuance, compare features, (each screenshot approx 960 pixels wide)
Edited: not a good width without a widescreen! The original PDF document may suit your monitor better! [Pictures removed.]

Edit #2:
Actually, I imagine the $150 Enterprise version is needed, for the jobs kalos described -
Nuance is a converter, not an editor.
« Last Edit: September 13, 2010, 07:08:08 PM by Curt »

daddydave
Supporting Member
Posts: 816
« Reply #11 on: September 13, 2010, 07:45:09 PM »

Quote from: Curt
I believe the $100 PRO version is needed if .XPS must be included.

Then why does the web site say otherwise? I quoted directly from the description of the $49.99 product. At any rate, the original poster said nothing about XPS.
« Last Edit: September 13, 2010, 07:46:40 PM by daddydave »

kalos
Member
Posts: 978
« Reply #12 on: September 14, 2010, 10:43:45 AM »

is there a free PDF editor/converter that works well? many converters fail to convert properly to doc, even Acrobat
if not, any paid one that is really good?

thanks

ha14
Participant
Posts: 264
« Reply #13 on: September 14, 2010, 01:30:40 PM »

Try Universal Document Converter: http://www.print-driver.com/howto/

rjbull
Charter Member
Posts: 2,701
« Reply #14 on: September 14, 2010, 02:52:04 PM »

Quote from: daddydave
The test was with the free version of Able2Extract, I take it?

No, painfully paid for.  Oh, and the reason why I chose Able2Extract over Nuance at the time - pre-test, as it happened - was because Nuance directed me to their UK shop, where they translate $US into £UK one-to-one.  I might have missed a slightly better program, but they missed my business through greed and taking UK residents for fools.

daddydave
Supporting Member
Posts: 816
« Reply #15 on: September 14, 2010, 03:09:28 PM »

Quote from: rjbull
Oh, and the reason why I chose Able2Extract over Nuance at the time - pre-test, as it happened - was because Nuance directed me to their UK shop, where they translate $US into £UK one-to-one.  I might have missed a slightly better program, but they missed my business through greed and taking UK residents for fools.

Ah...got it. Yes, I did wonder about that, thanks for clarifying.

rjbull
Charter Member
Posts: 2,701
« Reply #16 on: September 14, 2010, 03:15:35 PM »

Quote from: kalos
is there a free PDF editor/converter that works well? many converters fail to convert properly to doc, even Acrobat
if not, any paid one that is really good?

Like others, I doubt you will find a free tool to do what you want, especially with tables.  I've used XPDF for extracting text from PDFs.  It will also extract images and (I think) do a few other things.  You might try the free online file conversion service Zamzar, which I found quite good when I tried it.  There's at least one more similar service, Media-Convert.

daddydave
Supporting Member
Posts: 816
« Reply #17 on: September 14, 2010, 03:24:20 PM »

I should add that I have experimented with freeware tools to convert PDFs to HTML (a potential intermediate format for Word), but the results were abysmal, so I don't really recommend that.

Perry Mowbray
N.A.N.Y. Organizer
Charter Member
Posts: 1,795
Thoughtful Scribbles
« Reply #18 on: September 14, 2010, 10:05:42 PM »

I have, at times, used OCR Terminal (an online OCR service) extensively, and as with most (I think probably all) other software, the results can be a little flaky and definitely need editing before use.

One job I was using it for was converting data printouts into Excel spreadsheets (which had to go via Word), and apart from mixing some of the combined cells up, it did pretty well. It still needed an edit, as the original document was not perfect quality and was in a small font size... but it saved quite a bit of time in the end.
« Last Edit: September 14, 2010, 10:08:40 PM by Perry Mowbray »

StuR
Charter Member
Posts: 4
« Reply #19 on: September 17, 2010, 08:26:01 AM »

kalos,

I've had very good luck a number of times with http://www.pdftoword.com/default.aspx . Upload your pdf to their site, they convert it and return by email. Occasionally within a few minutes, always within 24 hours. And free!

Just two days ago at work I had them convert a 9-pg pdf with tables, highlighting, and a screenshot: the .doc version they returned was perfect.

Curt
Supporting Member
Posts: 6,262
« Reply #20 on: September 17, 2010, 10:27:13 AM »

-your first post after FIVE years' membership?!
Wow, you're not a man of too many words. Respect!


Quote from: StuR
... http://www.pdftoword.com/default.aspx ...
... the .doc version they returned was perfect.

-as it also would be if you use their desktop program: Nitro PDF Professional

kalos
Member
Posts: 978
« Reply #21 on: September 17, 2010, 10:33:40 AM »

Quote from: Curt
-your first post after FIVE years' membership?!
Wow, you're not a man of too many words. Respect!


it's an honour to have you post in my thread

StuR
Charter Member
Posts: 4
« Reply #22 on: September 17, 2010, 10:38:14 AM »

Curt,

Yeah, I've been mostly a lurker here. I imagine there are a bunch of DC people like me: active and interested computer user both at home and at work, but not a coder, not a serious computer hobbyist, not too much interested in mucking about with hardware or modifying software or the like.

But I follow DC enthusiastically, and it's given me FARR and Screenshot Captor (which I use and talk up regularly) and a bunch of other interesting things.
Mouser is a god.

So I've held back not through reticence, but because most active posters have levels of knowledge and skill well above mine. When I feel I have something useful to add, I will.

(and now two posts in one day. It's a Trend!)


steveorg
Participant
Posts: 20
« Reply #23 on: September 17, 2010, 04:19:01 PM »

Quote from: kalos
...extract a table and save it as a graphics file...

...I will incorporate it in a PowerPoint or MS Office document...

...I need it to be in the original quality...

...not to use a crop tool, because I need the optimum margins, etc

I'm focusing on the graphics approach that you mentioned because that just seems like the path of least resistance. Use Screenshot Captor to grab the image and your favorite graphics editor to adjust the margins to your liking.

kalos
Member
Posts: 978
« Reply #24 on: September 20, 2010, 12:57:40 PM »

disadvantages of Screenshot Captor (and any other screenshot tool):

1) it does not capture the graphics at the original/default resolution: the captured resolution changes as you zoom the PDF in or out, so the result may not have the optimum resolution (too zoomed out and it looks distorted when enlarged later; too zoomed in and it may no longer fit on screen to capture)

2) it does not capture the graphics with the original/default borders (since you drag the rectangle yourself), and this may result in a badly proportioned image
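
To put rough numbers on point 1: PDF page coordinates are measured in points (1/72 inch), so the pixel size of a screenshot depends on the viewer zoom and the screen's pixel density, while a direct render depends only on the DPI you choose. Here is a minimal sketch of that arithmetic, assuming a 96 DPI screen and a hypothetical table of 360 x 180 points (neither figure comes from this thread, and the function names are made up for illustration):

```python
POINTS_PER_INCH = 72  # PDF default user-space unit: 1 point = 1/72 inch


def screenshot_pixels(width_pt, height_pt, zoom, screen_dpi=96):
    """Pixel size of a page region as shown on screen at a given viewer zoom.

    screen_dpi=96 is an assumption; real displays vary.
    """
    scale = zoom * screen_dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


def render_pixels(width_pt, height_pt, dpi):
    """Pixel size of the same region when rendered directly at a fixed DPI."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


# Hypothetical table: 360 x 180 points (5 x 2.5 inches on the page).
table = (360, 180)
for zoom in (0.5, 1.0, 2.0):
    print(f"screenshot at {zoom:.0%} zoom:", screenshot_pixels(*table, zoom))
print("render at 300 DPI:", render_pixels(*table, 300))
```

The screenshot size changes with every zoom step, whereas a tool that renders or extracts the region directly gives the same size each time for a chosen DPI.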

look how nicely this PDF editor recognizes and selects the image (see the blue borders):

but it cannot copy and paste in an MS Paint file! it cannot extract it!