extracting info from pdf

Curt:
--- Quote from cmpm (September 21, 2010, 10:52 AM) ---
If you could give an example PDF for this operation, perhaps we could all experiment with the various tools each of us has.
--- End quote ---
Plus, of course, a much more precise description of what the job is.

kalos:
But I already mentioned this.

It is about extracting a photo, diagram, index, etc. from a PDF file, but not by taking a screenshot, which is not precise (since the result varies with the zoom level).

You told me that Solid PDF Tools offers this: automatically recognizing/selecting a table, graphic, etc. (all PDF editors do this) and extracting/saving it as an image file (no PDF editor does this; they only do it if you take a screenshot).

There is no way to work properly with PDF files. I wonder why they created such a format; it is very frustrating.

TO SUM UP:
I just need to be able to extract graphics properly, which means:

1) at the optimum resolution (the best possible quality, with no distortion from zooming in too far and no loss of quality from zooming out too far)
2) with the optimum borders (correctly proportioned and not missing any area of the graphic, even if that area is empty)
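
To put a number on point 1: for an embedded raster image, the "optimum" resolution is the one at which each stored pixel maps to exactly one output pixel. A rough sketch of that arithmetic follows, assuming you already know the image's pixel size and the size at which it is placed on the page in PDF points (1/72 inch in the default coordinate system). The function names and the 96 DPI screen value are assumptions for illustration, and viewers differ in what they treat as 100% zoom.

```python
POINTS_PER_INCH = 72.0  # default PDF user-space unit is 1/72 inch


def native_dpi(pixel_width, pixel_height, placed_width_pt, placed_height_pt):
    """Return (horizontal, vertical) DPI at which one stored pixel equals one rendered pixel."""
    if placed_width_pt <= 0 or placed_height_pt <= 0:
        raise ValueError("placed size must be positive")
    return (
        pixel_width * POINTS_PER_INCH / placed_width_pt,
        pixel_height * POINTS_PER_INCH / placed_height_pt,
    )


def zoom_for_screen(dpi, screen_dpi=96.0):
    """Zoom factor at which a screenshot would match the given DPI.

    screen_dpi=96 is an assumed value, not something every viewer uses.
    """
    return dpi / screen_dpi


# A 1200x900 pixel image placed at 288x216 pt (4x3 inches) on the page:
horizontal, vertical = native_dpi(1200, 900, 288, 216)
print(horizontal, vertical)         # 300.0 300.0
print(zoom_for_screen(horizontal))  # 3.125
```

Any screenshot taken at a different zoom resamples the image, which is why the snapshot route loses quality or adds interpolation artifacts.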

Also, I would like to be able to extract tables, diagrams, etc. in a format in which I can easily replace their text without damaging the layout or structure of the graph, diagram, or table, but I bet this is too much to ask of the PDF format.

rjbull:
Here is part of the manual for pdfimages, part of the XPDF suite:

------------------------------------------------------------------------------
pdfimages(1)                                                      pdfimages(1)



NAME
       pdfimages  -  Portable  Document  Format (PDF) image extractor (version
       3.02)

SYNOPSIS
       pdfimages [options] PDF-file image-root

DESCRIPTION
       Pdfimages saves images from a Portable Document Format  (PDF)  file  as
       Portable Pixmap (PPM), Portable Bitmap (PBM), or JPEG files.

       Pdfimages  reads  the  PDF file, scans one or more pages, PDF-file, and
       writes one PPM, PBM, or JPEG file for each  image,  image-root-nnn.xxx,
       where  nnn  is  the image number and xxx is the image type (.ppm, .pbm,
       .jpg).

       NB: pdfimages extracts the raw image data from the  PDF  file,  without
       performing  any  additional  transforms.  Any rotation, clipping, color
       inversion, etc. done by the PDF content stream is ignored.
------------------------------------------------------------------------------
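
For anyone who wants to script this, here is a minimal Python sketch that builds the command line from the SYNOPSIS above and runs it. The -f, -l and -j options (first page, last page, keep JPEG data as .jpg) come from the fuller pdfimages manual rather than the excerpt quoted here, so check them against your installed version; the function names and the file names report.pdf and img are made up for illustration.

```python
import shutil
import subprocess


def build_pdfimages_command(pdf_file, image_root, first_page=None,
                            last_page=None, keep_jpeg=True):
    """Build the argument list: pdfimages [options] PDF-file image-root."""
    cmd = ["pdfimages"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]   # first page to scan
    if last_page is not None:
        cmd += ["-l", str(last_page)]    # last page to scan
    if keep_jpeg:
        cmd.append("-j")                 # write JPEG-encoded images as .jpg
    cmd += [pdf_file, image_root]
    return cmd


def extract_images(pdf_file, image_root, **options):
    """Run pdfimages; raises if the tool is missing or exits with an error."""
    if shutil.which("pdfimages") is None:
        raise FileNotFoundError("pdfimages not found on PATH")
    subprocess.run(build_pdfimages_command(pdf_file, image_root, **options),
                   check=True)


# Shows the command that would be run for pages 3 to 5 of a hypothetical file:
print(build_pdfimages_command("report.pdf", "img", first_page=3, last_page=5))
```

Because pdfimages writes the raw stored image data, the output is at the image's own pixel dimensions regardless of zoom, but as the NB says, any rotation or clipping applied on the page is not reproduced. It also only helps with embedded raster images, not with vector diagrams or tables drawn as text and lines.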

Curt:
Solely because of your request, I have now tested Solid PDF Tools, and I cannot help thinking that you may not yet have fully understood how to use the program. It can do all you asked for. If you still have the program installed, please watch the online tutorials and read the manual. Remember that the program will not edit picture, Excel or Word files; it will only create them. (Look for a new folder!)

http://www.soliddocuments.com/info.htm?product=SolidPDFTools&id=233&frame=4&subject=CreatePDFtoExcel etcetera.

My Nitro PDF PRO OCR will also do what you ask for. My AnyBizSoft 5-in-1 PDF, as well.

kalos:
It was because of you that I tested Solid PDF Tools.

Wait, what procedure do you follow in Nitro PDF?

1)
Click EDIT, then click on the graphic you want to copy in the PDF file, then right-click COPY, then paste into MS Paint?
If so, it doesn't always work. To be honest, it doesn't work with most graphics, maybe because the graphics are 'protected'.

2)
Click "Snapshot", then drag to select an area, then paste into MS Paint?
This way ALL the above-mentioned problems occur (not optimum resolution, not optimum borders).

I am curious to read how you do this.
