Author Topic: Count Number of Pages in PDF Files  (Read 49100 times)

nnebeel

  • Participant
  • Joined in 2013
  • Posts: 5
Count Number of Pages in PDF Files
« on: April 30, 2013, 12:27 PM »
I'd like a small program that allows me to select multiple PDF files or a directory with PDF files (recursive) and see the total number of pages across the files.

For example, if I have folder A with ten three-page PDF files and folder B with five five-page PDF files, and I select both folders, the program would display a dialog box with the total number of pages in the PDF files (55 pages).

Optionally, it would be nice to have the program output a delimited text file with the file name, file size, and number of pages, kind of like a DIR command, but with page numbers as an added property. Then I could import it into the database/spreadsheet program of my choice.

CMD or GUI would be fine.
« Last Edit: April 30, 2013, 04:20 PM by nnebeel »
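
The request above boils down to three steps: walk the selected folders recursively, get a page count for each PDF, and add the counts up. The following is only a minimal Python sketch of that loop, not a finished tool. The names `find_pdfs`, `total_pages`, and the `count_pages` parameter are made up for this illustration; the actual page counting is left to whatever function is passed in (for example, a wrapper around a command-line tool).

```python
import os
from typing import Callable, Iterable, List, Tuple


def find_pdfs(roots: Iterable[str]) -> List[str]:
    """Return every .pdf file under the given folders (recursive), sorted."""
    found = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if name.lower().endswith(".pdf"):
                    found.append(os.path.join(dirpath, name))
    return sorted(found)


def total_pages(
    roots: Iterable[str],
    count_pages: Callable[[str], int],  # hypothetical: any function mapping a path to a page count
) -> Tuple[int, List[Tuple[str, int, int]]]:
    """Return the grand total plus one (path, size in bytes, pages) row per PDF."""
    rows = [(path, os.path.getsize(path), count_pages(path)) for path in find_pdfs(roots)]
    return sum(pages for _path, _size, pages in rows), rows
```

With folder A holding ten three-page files and folder B holding five five-page files, `total_pages([A, B], count_pages)` would return a total of 10 × 3 + 5 × 5 = 55, and the rows could be written out as a delimited text file.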

rjbull

  • Charter Member
  • Joined in 2005
  • Posts: 3,199
Re: Count Number of Pages in PDF Files
« Reply #1 on: April 30, 2013, 04:06 PM »
This only gets you part way:

Command-line pdfinfo from the XPDF suite will print the page count of an individual file, amongst much other data.  So can PDFTK, but again one file at a time.

You could put either in a batch file for multiple PDFs, but you'd have to massage the output data to make it tidy.

nnebeel

  • Participant
  • Joined in 2013
  • Posts: 5
Re: Count Number of Pages in PDF Files
« Reply #2 on: April 30, 2013, 04:52 PM »
Quote from: rjbull on April 30, 2013, 04:06 PM
You could put either in a batch file for multiple PDFs, but you'd have to massage the output data to make it tidy.

I had looked into using pdfinfo before, but the output would indeed need some massaging. The source code for pdfinfo is at https://github.com/flooved/pdf2image/blob/master/xpdf/pdfinfo.cc. When I use the command pdfinfo -meta [filename], I get the following output:

Code: Text
  Tagged:         no
  Form:           none
  Pages:          7
  Encrypted:      no
  Page size:      613 x 790 pts
  File size:      339400 bytes
  Optimized:      no
  PDF version:    1.4

Maybe someone could create a spin-off from this so that, instead of creating an eight-line block of text, it created a single line in a delimited (maybe comma-delimited) text file with column headers in the first line, something like this:

Code: Text
  Tagged,Form,Pages,Encrypted,Page width,Page height,File size,Optimized,PDF Version
  no,none,7,no,613,790,339400,no,1.4

Then it could easily be used in a batch file operation and/or in a GUI.
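
As a rough illustration of that idea, here is a Python sketch that turns the eight-line pdfinfo output quoted above into one CSV row with a header line. It is not a patch to pdfinfo itself. `SAMPLE` is the output from the post; `parse_pdfinfo`, `to_csv`, and `COLUMNS` are names invented for this example, and the parsing assumes the "Key: value" layout shown in the post.

```python
import csv
import io
import re

# The pdfinfo output quoted in the post above.
SAMPLE = """\
Tagged:         no
Form:           none
Pages:          7
Encrypted:      no
Page size:      613 x 790 pts
File size:      339400 bytes
Optimized:      no
PDF version:    1.4
"""

COLUMNS = ["Tagged", "Form", "Pages", "Encrypted", "Page width",
           "Page height", "File size", "Optimized", "PDF version"]


def parse_pdfinfo(text):
    """Turn 'Key: value' lines into a flat dict whose keys match COLUMNS."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    # "613 x 790 pts" -> separate width and height columns
    size = re.match(r"([\d.]+) x ([\d.]+)", fields.pop("Page size", ""))
    if size:
        fields["Page width"], fields["Page height"] = size.groups()
    # "339400 bytes" -> "339400"
    if "File size" in fields:
        fields["File size"] = fields["File size"].partition(" ")[0]
    return fields


def to_csv(records):
    """Render parsed records as CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, extrasaction="ignore",
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_csv([parse_pdfinfo(SAMPLE)]))
# Tagged,Form,Pages,Encrypted,Page width,Page height,File size,Optimized,PDF version
# no,none,7,no,613,790,339400,no,1.4
```

Any extra fields that pdfinfo prints (title, author, and so on) are ignored here; adding them would only mean extending `COLUMNS`.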

x16wda

  • Supporting Member
  • Joined in 2007
  • Posts: 888
  • what am I doing in this handbasket?
Re: Count Number of Pages in PDF Files
« Reply #3 on: April 30, 2013, 08:07 PM »
Couldn't you just grep the output to find the Pages: line then use set to pull out the page count?  Set has some niceties you don't often think about.
vi vi vi - editor of the beast
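
The same "find the Pages: line and pull out the number" idea can be sketched in Python with a regular expression. This is an illustration rather than the grep/set approach itself: `pages_from_output` and `pages_in_file` are hypothetical helper names, and the second function assumes a pdfinfo executable is reachable on the PATH.

```python
import re
import subprocess
from typing import Optional


def pages_from_output(pdfinfo_output: str) -> Optional[int]:
    """Return the number on the 'Pages:' line, or None if there is no such line."""
    # ^Pages: anchors at the start of a line, so "Page size:" is not matched.
    match = re.search(r"^Pages:\s*(\d+)\s*$", pdfinfo_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def pages_in_file(pdf_path: str, pdfinfo: str = "pdfinfo") -> Optional[int]:
    """Run pdfinfo on one file (assumed to be on PATH) and parse its output."""
    result = subprocess.run([pdfinfo, pdf_path], capture_output=True,
                            text=True, check=True)
    return pages_from_output(result.stdout)
```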

4wd

  • Supporting Member
  • Joined in 2006
  • Posts: 5,641
Re: Count Number of Pages in PDF Files
« Reply #4 on: May 01, 2013, 02:23 AM »
If you're willing to add a couple of extra files (besides pdfinfo.exe) to get it working, here's a DOS batch/command file that will output:

"Filename", pages

Call as follows:

PDF-Pages.cmd <directory>

Include quotes around the directory path if required, e.g. PDF-Pages.cmd "D:\My Documents"

Output goes to a file named output.txt.

Code: Text
  @echo off
  REM PDF-Pages.cmd
  REM Writes one line per PDF to output.txt: "full path", pages
  if exist output.txt del output.txt
  for /r %1 %%f in (*.pdf) do pdfinfo.exe -meta "%%f" >out.txt & echo "%%f", | tr.exe -d "\r\n" >>output.txt & find "Pages:" out.txt | tr.exe -d "\055\056\072[:alpha:][:space:]" >>output.txt & echo. >>output.txt
  if exist out.txt del out.txt

You'll need tr.exe, libintl3.dll and libiconv2.dll from coreutils binaries and coreutils dependencies archives available here - put them in the same directory as the batch/command file.

Worked OK here over 380 PDF files in 21 sub-directories.

Anyway, it was an exercise in DOS :D

EDIT: Oops!  Just realised this isn't quite what you're after  :-[
« Last Edit: May 01, 2013, 02:36 AM by 4wd »

IainB

  • Supporting Member
  • Joined in 2008
  • Posts: 7,540
  • @Slartibartfarst
Re: Count Number of Pages in PDF Files
« Reply #5 on: May 01, 2013, 05:15 AM »
Another way to crack this could be to use SysExporter - see SysExporter - (Screen-scraping) Export data from Windows controls - Mini-Review
You could then drop the PDF files' details (including page counts) into Excel and perform arithmetic or other operations - whatever you wanted to do with the data.
For example, you could copy this sort of data into Excel, from Windows Explorer (I'm using xplorer² in this screenshot clip):

[Screenshot: xplorer² - PDF file page numbers.png]

You could have a filtered view of all the PDF files in a set of nested directories this way, displayed as a flat file in xplorer², and turf all that data into Excel as a table, using SysExporter, together with (say) the file path data. It would effectively be a snapshot database index of all your PDF files in that set of nested folders, with the metadata being whatever you had chosen as display columns for (say) file properties - e.g. including "Pages".
You could select any data from the Excel table and (say) feed it in to batch files as parameters.
« Last Edit: May 01, 2013, 05:26 AM by IainB »

nnebeel

  • Participant
  • Joined in 2013
  • Posts: 5
Re: Count Number of Pages in PDF Files
« Reply #6 on: May 01, 2013, 01:10 PM »
Thank you all for your input! 4wd, your solution worked great! I only had to tweak it slightly to make it do exactly what I wanted. Here's the code I ended up using:

Code: Text
  @echo off
  REM PDF-Pages.cmd
  REM Writes one line per PDF to output.txt: file name,pages
  if exist output.txt del output.txt
  for /r %1 %%f in (*.pdf) do pdfinfo.exe -meta "%%f" >out.txt & echo %%~nxf,| tr.exe -d "\r\n" >>output.txt & find "Pages:" out.txt | tr.exe -d "\055\056\072[:alpha:][:space:]" >>output.txt & echo. >>output.txt
  if exist out.txt del out.txt

I just changed the file name written to the output so that it includes only the name and extension instead of the full path, and I deleted the quotes around the file name and the space after the comma so that Excel would handle it more easily. These are obviously minor tweaks. I also realized that a comma might not be the best delimiter here because file names can include commas, too. But that would be easily fixed if I needed to.
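
To illustrate the delimiter concern: a CSV writer that quotes fields containing the delimiter keeps such file names intact, and a tab delimiter avoids the clash in most cases. The Python sketch below is a separate illustration, not a change to the batch file; `write_rows` and the sample file names are made up.

```python
import csv
import io


def write_rows(rows, delimiter=","):
    """Return delimited text; fields containing the delimiter are quoted automatically."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["File name", "Pages"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical file names, one of which contains a comma.
rows = [("Report, final.pdf", 7), ("plain.pdf", 3)]

print(write_rows(rows))
# File name,Pages
# "Report, final.pdf",7
# plain.pdf,3

print(write_rows(rows, delimiter="\t"))  # tab-delimited alternative
```

Spreadsheet programs generally read the quoted form back as a single field, so the comma inside the name does not shift the page count into the wrong column.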

Of course, we could also use TR to extract the other fields (if we wanted to), but this is a great solution for me! TR seems very powerful. I haven't had much use for GNU commands before. Now I have some new stuff to play with!

Thank you!

IainB, that sounds like a nifty solution, but I was hoping for something I could use without having to install anything.

x16wda, sure, GREP would work, too.

skwire

  • Global Moderator
  • Joined in 2005
  • Posts: 5,286
Re: Count Number of Pages in PDF Files
« Reply #7 on: May 01, 2013, 01:48 PM »
You could also use my PDFInfoGUI application: http://skwire.dcmemb.../fp/?page=pdfinfogui

nnebeel

  • Participant
  • Joined in 2013
  • Posts: 5
Re: Count Number of Pages in PDF Files
« Reply #8 on: May 01, 2013, 02:10 PM »
skwire, thanks for playing the trump card!  :Thmbsup:

PDFInfoGUI
Export CSV button? Check.
Folder recursion? Check.
Custom PDF file list? Check.
Separate fields for path and file name? Check.
Full document properties? Check.

Best tool I've found this month! Had to come to the experts to find it, though.

Thanks!

skwire

  • Global Moderator
  • Joined in 2005
  • Posts: 5,286
Re: Count Number of Pages in PDF Files
« Reply #9 on: May 01, 2013, 03:05 PM »
Right on.  =]  BTW, welcome to the DonationCoder site and enjoy your stay.  =]

velvasi

  • Participant
  • Joined in 2010
  • Posts: 1
Re: Count Number of Pages in PDF Files
« Reply #10 on: August 07, 2013, 09:23 PM »
Hi, I came across this post when I was searching for a PDF page counter.  I found a simple freeware tool and thought I'd share it here in case it's useful to somebody in the future.  http://download.cnet...0743_4-75967133.html

Thank you!

senturion

  • Participant
  • Joined in 2014
  • Posts: 2
Re: Count Number of Pages in PDF Files
« Reply #11 on: July 21, 2014, 12:38 PM »
Thank you 4wd!!!


This greatly helped me. I'm now saving the script, binaries, and dependencies into a zip for future use. I also added a bit to the batch file to output the file size as well. When I open the result in Excel, I convert the size to kilobytes (it is in bytes to begin with). Here is the script including file size:

Code: Text
  @echo off
  REM PDF-Pages.cmd
  REM Writes one line per PDF to output.txt: "full path", pages, file size in bytes
  if exist output.txt del output.txt
  for /r %1 %%f in (*.pdf) do pdfinfo.exe -meta "%%f" >out.txt & echo "%%f", | tr.exe -d "\r\n" >>output.txt & find "Pages:" out.txt | tr.exe -d "\r\n\055\056\072[:alpha:][:space:]" >>output.txt & echo , | tr.exe -d "\r\n" >>output.txt & find "File size:" out.txt | tr.exe -d "\055\056\072[:space:][:alpha:]" >>output.txt & echo. >>output.txt
  if exist out.txt del out.txt

« Last Edit: July 21, 2014, 01:29 PM by senturion »

senturion

  • Participant
  • Joined in 2014
  • Posts: 2
Re: Count Number of Pages in PDF Files
« Reply #12 on: July 21, 2014, 01:50 PM »
Perhaps a couple things to note when you are running the above script in a top-level directory containing 58,000+ PDF documents:

A. This will take a while :)

B. Prepare for cmd.exe to start consuming memory. We're up to 2.97 GB used by CMD.exe at the moment. I may have to let this run overnight instead.  :P

scaver

  • Participant
  • Joined in 2015
  • Posts: 1
Re: Count Number of Pages in PDF Files
« Reply #13 on: March 12, 2015, 01:49 AM »
Sorry for bumping this thread, but I'm struggling with the following.
I want to display the total number of pages of the PDFs in each (sub)folder as well, not only the final total.

Something like:
c:\pdfs\path1 83
c:\pdfs\path2 12
c:\pdfs\path2\subfolder 3
etc.

Is that possible?
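
One possible way to get per-folder totals, sketched in Python rather than in the batch file, is to group the per-file counts by parent folder. This is an illustration under assumptions: `pages_per_folder` is a made-up name, the input is a list of (path, pages) pairs such as the rows in output.txt, and the file names and individual counts below are invented so that the totals match the example in the question.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def pages_per_folder(file_pages, cumulative=False):
    """Sum pages per folder from (pdf_path, pages) pairs.

    With cumulative=True, each file also counts toward every ancestor folder.
    """
    totals = defaultdict(int)
    for pdf_path, pages in file_pages:
        folder = PureWindowsPath(pdf_path).parent
        totals[str(folder)] += pages
        if cumulative:
            for ancestor in folder.parents:
                totals[str(ancestor)] += pages
    return dict(sorted(totals.items()))


# Hypothetical per-file counts.
files = [
    (r"c:\pdfs\path1\a.pdf", 80),
    (r"c:\pdfs\path1\b.pdf", 3),
    (r"c:\pdfs\path2\c.pdf", 12),
    (r"c:\pdfs\path2\subfolder\d.pdf", 3),
]

for folder, pages in pages_per_folder(files).items():
    print(folder, pages)
# c:\pdfs\path1 83
# c:\pdfs\path2 12
# c:\pdfs\path2\subfolder 3
```

Whether a folder's total should include its subfolders is not clear from the question; passing `cumulative=True` gives that variant, in which `c:\pdfs\path2` would show 15.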