DonationCoder.com Software > Post New Requests Here

IDEA: get all images, tables from a pdf and save them as pdfName-fig-number.png

urlwolf:
When I read APA-formatted articles, the figures and tables are at the end, so I have to jump back and forth between the place where a figure is mentioned and the figure itself.

Acrobat has a function to do what I want:
advanced > export images.

However, it fails on many occasions. It says that only images that contain raster or bitmap data can be exported. Most papers don't.

What I do "by hand" is to use screencaptor to take them (crop them) and save them. I then use the Dopus picture viewer to see them when they are mentioned.

A program that could do this automatically would be a godsend for academics around the world.

jgpaiva:
I'm not sure if it can do it, but just take a look at Mark0's BitmapRip (Bitmap Ripper).
I think it might be exactly what you're looking for.

skrommel:
 :) How about opening the pdf file twice?

I couldn't get BitmapRipper to work, but Carol Haynes mentioned PDF Explorer from http://homepage.oniduo.pt/pdfe/downloads.html, and this one extracts images like a charm. Also, GhostView from http://www.seas.ucla.edu/~ee5cta/ghostView/ can convert pdf to jpg or tif, to ease the cutting and pasting.

Skrommel

urlwolf:
I think the problem is that some of the graph-tables are not bitmaps.
For example, if the author drew a chart in Word, it would be a vector graphic in the final pdf and would be ignored by these tools. Same with tables.

I guess opening two copies is not a bad idea, although it costs a lot of screen real estate (and I'm on a laptop to start with).

Thanks a lot
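
For vector charts and tables that image extractors skip, one workaround is to rasterize whole pages instead. A minimal Python sketch, assuming Ghostscript is installed and callable as `gs` (on Windows the console binary usually has a different name, such as `gswin64c`). The function names and the `-fig-` naming are illustrative choices that follow the pattern asked for in the thread title. Note that this renders pages, not individual figures, so it only approximates "one file per figure" when each figure sits on its own page.

```python
import subprocess
from pathlib import Path


def ghostscript_page_command(pdf_path, first_page, last_page, dpi=150, gs_binary="gs"):
    """Build a Ghostscript command that renders a page range to PNG files.

    The output pattern is <pdfName>-fig-%d.png, written next to the PDF.
    Ghostscript replaces %d with a running output number.
    """
    pdf = Path(pdf_path)
    out_pattern = pdf.with_name(f"{pdf.stem}-fig-%d.png")
    return [
        gs_binary,
        "-sDEVICE=png16m",            # 24-bit colour PNG output device
        f"-r{dpi}",                   # rendering resolution in dpi
        f"-dFirstPage={first_page}",  # first page to render
        f"-dLastPage={last_page}",    # last page to render
        "-o", str(out_pattern),       # output file pattern
        str(pdf),
    ]


def render_pages(pdf_path, first_page, last_page, dpi=150, gs_binary="gs"):
    """Run Ghostscript; raises CalledProcessError if it exits with an error."""
    cmd = ghostscript_page_command(pdf_path, first_page, last_page, dpi, gs_binary)
    subprocess.run(cmd, check=True)
```

For a 24-page article whose figures occupy the last five pages, `render_pages("paper.pdf", 20, 24)` would write one PNG per page, which can then be cropped or viewed in any picture viewer.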

rjbull:
Sounds like you might like to try XPDF (freeware), specifically this component of it:


DESCRIPTION
       Pdfimages saves images from  a  Portable  Document  Format
       (PDF)  file  as  Portable  Pixmap  (PPM),  Portable Bitmap
       (PBM), or JPEG files.

       Pdfimages reads the PDF file, PDF-file, scans one or  more
       pages,  and  writes one PPM, PBM, or JPEG file for each
       image, image-root-nnn.xxx, where nnn is the  image  number
       and xxx is the image type (.ppm, .pbm, .jpg).


--- End quote ---
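
Following the description above, a small Python wrapper could call pdfimages with an image root derived from the PDF name, so the output lands as `pdfName-fig-nnn.xxx`. This is only a sketch: it assumes the `pdfimages` binary from XPDF is on the PATH, and the function names and the `-fig` suffix are illustrative. As noted earlier in the thread, this only extracts embedded raster images; vector charts and tables are not written out.

```python
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, out_dir=".", as_jpeg=True, first_page=None, last_page=None):
    """Build a pdfimages command line: pdfimages [options] PDF-file image-root."""
    pdf = Path(pdf_path)
    image_root = Path(out_dir) / f"{pdf.stem}-fig"  # gives <pdfName>-fig-nnn.xxx
    cmd = ["pdfimages"]
    if as_jpeg:
        cmd.append("-j")                  # save DCT-encoded images as JPEG files
    if first_page is not None:
        cmd += ["-f", str(first_page)]    # first page to scan
    if last_page is not None:
        cmd += ["-l", str(last_page)]     # last page to scan
    cmd += [str(pdf), str(image_root)]
    return cmd


def extract_images(pdf_path, **options):
    """Run pdfimages; raises CalledProcessError if it exits with an error."""
    subprocess.run(pdfimages_command(pdf_path, **options), check=True)
```

For example, `extract_images("paper.pdf", first_page=20)` would scan from page 20 onward and write files named `paper-fig-nnn` with a `.ppm`, `.pbm`, or `.jpg` extension into the current directory.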
