Topic: Controlling certain facts in a folder  (Read 13620 times)

Contro

  • Supporting Member
  • Joined in 2007
  • **
  • Posts: 3,940
    • View Profile
    • Donate to Member
Controlling certain facts in a folder
« on: December 13, 2011, 01:58 AM »
Controlling certain facts in a folder

I have a folder containing subfolders and files.
I would like to find software that can:
a) Count the number of pages per file type. For example: the number of pages in the PDF files. Or even per group of file types: the total for PDF and DOC files together.
b) For audio files, report the duration in minutes.
c) For video files, report the duration in minutes, and sum the durations of all video files in the folder and its subfolders.

I know some utilities that show file sizes, but I would like one that reports these particular parameters.

Best Regards

Any partial solution is good for me too: for example, a program that counts the pages of the PDF files in a folder and its subfolders.

 :P

IainB

  • Supporting Member
  • Joined in 2008
  • **
  • Posts: 7,540
  • @Slartibartfarst
    • View Profile
    • Read more about this member.
    • Donate to Member
Re: Controlling certain facts in a folder
« Reply #1 on: December 13, 2011, 03:09 AM »
Interesting questions for some. Non-trivial.
What do you intend to use the data (sums and counts) for? Does it matter how accurate it is?

I quite like the idea of being able to count or "stocktake" things like this.
It seems to be a classic accounting problem, but I don't have any experience or knowledge of how it might be done in this specific case. I suppose estimation could be a pragmatic approach - rather than actual physical counting, I mean.

Off the top of my head (so apologies if this seems a bit rushed):

Document files:
Depending on the accuracy required, I think it might be useful - if not necessary - for document files to have some definition.
For example:
  • to define what is meant by the unit "page" (e.g., A4, Legal, A2, A3, etc.) - so storage unit size would be defined.
  • to establish what languages/alphabets you will have in those documents (different alphabet systems may have different packing densities).
  • to define what font and point-size you are assuming is used - so density per page could be a concept.
  • to define average word-length.
  • to define what max, min and average word density would be estimated for the classification of a page-unit. (e.g., do you want to call something with only 5 words on it "A page"?)
  • to establish how to cope with pictures (images) in a document, and whether they cover a part of a page (and how much) or a whole page, have captions, headers, etc.
  • to establish how to cope with handwriting in a document.
  • to establish how to cope with documents (e.g., .PDF or Word files) which have no actual text but only images of pages with words on (this could imply the need for OCRing the documents).
  • do you need right-to-left or left-to-right reading/parsing, or both?
  • do you have landscape or portrait oriented pages, or both?
  • what to do with a frequency estimate for blank pages?

Then you might need to have (say) a function to define the typical density of words, by page.
Physical paper pages could be various sizes, but I suspect you'd have to define a normative/standard size.
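
As a rough illustration of the estimation approach described above, here is a minimal Python sketch that turns a word count into a page estimate. The function name and the default of 500 words per page are assumptions made purely for illustration; a real figure would depend on the page size, font and language choices listed above.

```python
import math


def estimate_pages(word_count: int, words_per_page: int = 500) -> int:
    """Estimate a page count from a word count.

    words_per_page is an assumed density, not a standard value.
    Any partially filled page is counted as a whole page.
    """
    if word_count < 0 or words_per_page <= 0:
        raise ValueError("word_count must be >= 0 and words_per_page > 0")
    return math.ceil(word_count / words_per_page)
```

With this rounding rule, a document of only 5 words still counts as one page, which is exactly the kind of definition that would need to be agreed beforehand.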

Audio files:
Not really sure about these.
Should be able to use standard tags of some (e.g., mp3) to get duration (time). I'm not sure, but that might even be a file property for audio files - if so, then Windows Explorer would presumably be able to display it as a column in details view, same as file "Comments".
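
To show the general idea of reading a duration from the file itself, here is a small Python sketch that sums the playing time of audio files in a folder and its subfolders. It is limited to uncompressed WAV files, used here only as a stand-in because Python's standard `wave` module can read their headers; MP3 and other formats would need a separate tag or media library. The function names are made up for this example.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path) -> float:
    """Return the playing time of one WAV file in seconds."""
    with wave.open(str(path), "rb") as w:
        # duration = number of audio frames / frames per second
        return w.getnframes() / w.getframerate()


def total_wav_minutes(folder) -> float:
    """Sum the durations of all .wav files under folder, in minutes."""
    total_seconds = 0.0
    for p in Path(folder).rglob("*"):  # rglob also walks the subfolders
        if p.is_file() and p.suffix.lower() == ".wav":
            total_seconds += wav_duration_seconds(p)
    return total_seconds / 60
```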

Video files:
Not sure at all about these.
Do they use standard tags for things like duration (time)? (I don't know.)

You might like to ask the question over at Quantified Self, where they have been looking at similarly knotty problems - e.g., Effect of One-Legged Standing on Sleep
Mind you, I reckon some of their theories haven't got a leg to stand on.
« Last Edit: December 13, 2011, 03:18 AM by IainB »

skwire

  • Global Moderator
  • Joined in 2005
  • *****
  • Posts: 5,286
    • View Profile
    • Donate to Member
Re: Controlling certain facts in a folder
« Reply #2 on: December 13, 2011, 08:57 AM »
b) For audio files, report the duration in minutes.
c) For video files, report the duration in minutes, and sum the durations of all video files in the folder and its subfolders.

Check out PlayTime.

Contro

Re: Controlling certain facts in a folder
« Reply #3 on: December 13, 2011, 04:38 PM »
I have seen something like this on Google: programs that count the number of pages of the PDF files contained in a folder. They are usually shareware.

What is it for?
I have received a request from the judge in my city about a case I opened, in which I submitted digitized documentation on a USB key. It tells me I must present the documentation in written form within 48 hours.
With Windows Explorer I have found:
362 pdf
88 word docs
216 eml files
671 jpg
497 png
4 amr files
etc.
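
A per-type tally like the one above can also be produced with a few lines of Python. This is only a sketch under simple assumptions: files are grouped by their extension (case-insensitive), and the function name is hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(folder) -> Counter:
    """Count the files under folder (subfolders included) per extension."""
    counts = Counter()
    for p in Path(folder).rglob("*"):
        if p.is_file():
            # Lower-case the suffix so ".PDF" and ".pdf" are counted together.
            counts[p.suffix.lower() or "(none)"] += 1
    return counts
```

Calling `count_by_extension(folder).most_common()` on the folder in question would list the extensions from most to least frequent.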

« Last Edit: December 13, 2011, 04:59 PM by Contro »

Contro

Re: Controlling certain facts in a folder
« Reply #4 on: December 13, 2011, 04:38 PM »
b) For audio files, report the duration in minutes.
c) For video files, report the duration in minutes, and sum the durations of all video files in the folder and its subfolders.

Check out PlayTime.

I am going to try it.

Contro

Re: Controlling certain facts in a folder
« Reply #5 on: December 13, 2011, 04:51 PM »
Hmm, skwire.
What about the amr files from my Nokia mobile?
PlayTime doesn't detect them.
How can I configure it?
Best Regards

skwire

Re: Controlling certain facts in a folder
« Reply #6 on: December 13, 2011, 04:58 PM »
Hmm, skwire.
What about the amr files from my Nokia mobile?
PlayTime doesn't detect them.
How can I configure it?

You can't because PlayTime doesn't support that format.  For a list of formats it does support, look here:

http://mediainfo.sou...t/en/Support/Formats

Regarding PDF files, I can write you a quick app that will display how many pages a PDF file has (among other bits of PDF metadata). 

Contro

Re: Controlling certain facts in a folder
« Reply #7 on: December 13, 2011, 05:00 PM »
Please write it.
 :-*

skwire

Re: Controlling certain facts in a folder
« Reply #8 on: December 14, 2011, 02:11 PM »
Please write it.

Give this a shot:  PDFInfoGUI

[Screenshot: main.png]

Contro

Re: Controlling certain facts in a folder
« Reply #9 on: December 14, 2011, 02:22 PM »
Please write it.

Give this a shot:  PDFInfoGUI
 (see attachment in previous post)

I'll try it immediately and comment.
 :-*
Best Regards

Contro

Re: Controlling certain facts in a folder
« Reply #10 on: December 14, 2011, 02:56 PM »
I have taken screenshots, if they are needed.
I have noticed that PDFInfoGUI finds 355 files when analyzing the target folder, while Explorer finds 363.
I don't know the reason, but I will try to find out what I am doing wrong.
After exporting the PDFInfoGUI results to a CSV file, I opened that file with CSVed and exported it to Excel. In Excel I obtained the sum of all the pages: 1136 pages.
PDFInfoGUI offers interesting information about the PDF files in its columns.
At present I don't know why the number of files found by PDFInfoGUI differs from Explorer's. Maybe it is due to repeated file names.
 :P
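
The CSV-to-Excel step for totalling the pages could also be done directly on the exported file. Below is a minimal Python sketch; the column header "Pages" is an assumption, since the layout of the PDFInfoGUI export is not shown in this thread, and the function name is hypothetical.

```python
import csv


def sum_pages(csv_file, column="Pages"):
    """Sum a numeric column of an exported CSV.

    csv_file is an open text file (or any iterable of lines).
    column is an assumed header name; adjust it to the real export.
    Returns (total, skipped), where skipped counts rows whose value
    is missing or not a whole number.
    """
    total = 0
    skipped = 0
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        if value.isdigit():
            total += int(value)
        else:
            skipped += 1
    return total, skipped
```

Opening the file with `open(path, newline="")` is the usual way to hand it to the `csv` module. The skipped count helps to spot files for which no page count was reported.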

Contro

Re: Controlling certain facts in a folder
« Reply #11 on: December 14, 2011, 03:20 PM »
I am analyzing in depth.

Subfolder      PDFInfoGUI   Explorer
CRONO                 354        362
0487.11D003           355        363
01.2010                 7          7

This last folder corresponds to January 2010, and both tools give the same number.
I will find out where the difference is and let you know.
Best Regards

skwire

Re: Controlling certain facts in a folder
« Reply #12 on: December 14, 2011, 03:23 PM »
I have noticed that PDFInfoGUI finds 355 files when analyzing the target folder, while Explorer finds 363.
I don't know the reason, but I will try to find out what I am doing wrong.

It could be that the pdfinfo.exe file doesn't recognise a few of the PDFs you have.  It seems to work for the most part, right?  I can add a feature which shows the total number of pages in the list if you'd like.
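
For anyone who wants to script the same kind of check, here is a rough Python sketch of reading a page count from the `pdfinfo` command-line tool. It is not PDFInfoGUI's code. It assumes a `pdfinfo` executable is on the PATH and that its output contains a line starting with "Pages:"; the function names are made up for this example.

```python
import re
import subprocess


def parse_page_count(pdfinfo_output: str):
    """Extract the number after 'Pages:' from pdfinfo's text output."""
    match = re.search(r"^Pages:\s+(\d+)", pdfinfo_output, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def pdf_page_count(pdf_path, pdfinfo_cmd="pdfinfo"):
    """Run pdfinfo on one file; return its page count or None on failure."""
    result = subprocess.run(
        [pdfinfo_cmd, str(pdf_path)], capture_output=True, text=True
    )
    if result.returncode != 0:
        # pdfinfo could not read the file (damaged, encrypted, not a PDF, ...)
        return None
    return parse_page_count(result.stdout)
```

Files for which `pdf_page_count` returns `None` would be the ones worth inspecting by hand when the totals do not match.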

Contro

Re: Controlling certain facts in a folder
« Reply #13 on: December 14, 2011, 03:35 PM »
Subfolder   PDFInfoGUI   Explorer
01.2011              1          1
02.2010              2          2
02.2011              4          4
03.2010              8          8   (this subfolder contains two subfolders with only screenshots)
03.2011              1          1
04.2010             26         26   (this subfolder contains 4 subfolders; 4 PDFs are in 04.2010 itself and the rest are in those subfolders)
 :-*

Contro

Re: Controlling certain facts in a folder
« Reply #14 on: December 14, 2011, 03:36 PM »
I have noticed that PDFInfoGUI finds 355 files when analyzing the target folder, while Explorer finds 363.
I don't know the reason, but I will try to find out what I am doing wrong.

It could be that the pdfinfo.exe file doesn't recognise a few of the PDFs you have.  It seems to work for the most part, right?  I can add a feature which shows the total number of pages in the list if you'd like.

Wonderful indeed.
As you can see, so far I have found a variety of situations, all with the same result.

Contro

Re: Controlling certain facts in a folder
« Reply #15 on: December 14, 2011, 03:46 PM »
Subfolder   PDFInfoGUI   Explorer
05.2010              8          8   (this folder contains 6 PDFs in the main folder and has 5 subfolders; two of those subfolders hold the 2 additional PDFs)
05.2011              1          1
06.2010             10         10   (this folder contains 7 PDFs, plus a subfolder with 2, plus a further subfolder inside that one with 1)
 :P

skwire

Re: Controlling certain facts in a folder
« Reply #16 on: December 14, 2011, 03:53 PM »
Are you able to post one of the PDFs that my application doesn't recognise?

Contro

Re: Controlling certain facts in a folder
« Reply #17 on: December 14, 2011, 04:08 PM »
Here we are:
Subfolder                                PDFInfoGUI   Explorer
07.2010                                          37         41

This folder contains ten more folders. One of them contains one subfolder, and that subfolder contains 6 more subfolders.

Subfolder                                PDFInfoGUI   Explorer
(main folder itself, 16 PDFs)                    16         16   (expected value; I'll run separate checks if necessary)
04.07.2010                                        0          0
04.07.2010.pinza levantada                        0          0
04.07.2010.vuelco fotográfico móvil               0          0
07.07.2010                                        0          0
08.07.2010                                        0          0
13.07.2010                                        0          0
0416.10 Denuncia Orange ante AEPD                20         22
pant.envio.04.07.2010 telecos                     0          0
pant.envio.20.07.2010.telecos                     0          0
pant.envio.25.07.2010


I will continue in a moment.

But I think I am making a mistake. For some reason Windows Explorer also looks inside RAR files that contain PDFs.
That seems to be the difference.
I will confirm after supper.
 :P
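
One way to test this suspicion is to count the PDFs that sit directly on disk separately from those packed inside archives. The Python sketch below does this for ZIP archives only, as a stand-in: Python's standard library can read ZIP but not RAR, so RAR files would need a third-party tool. The function name is hypothetical.

```python
import zipfile
from pathlib import Path


def count_pdfs(folder):
    """Return (pdfs_on_disk, pdfs_inside_zip_archives) for a folder tree."""
    on_disk = 0
    in_archives = 0
    for p in Path(folder).rglob("*"):
        if not p.is_file():
            continue
        suffix = p.suffix.lower()
        if suffix == ".pdf":
            on_disk += 1
        elif suffix == ".zip" and zipfile.is_zipfile(str(p)):
            with zipfile.ZipFile(str(p)) as archive:
                # Count archive members whose name ends in .pdf
                in_archives += sum(
                    1 for name in archive.namelist()
                    if name.lower().endswith(".pdf")
                )
    return on_disk, in_archives
```

If the first number matches one tool and the sum of both numbers matches the other, the archives explain the difference.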

Contro

Re: Controlling certain facts in a folder
« Reply #18 on: December 14, 2011, 04:09 PM »
Are you able to post one of the PDFs that my application doesn't recognise?

At present I think your application recognizes all the PDFs.

skwire

Re: Controlling certain facts in a folder
« Reply #19 on: December 14, 2011, 04:12 PM »
Here's the version with total page count in the statusbar:

PDFInfoGUI download

Contro

Re: Controlling certain facts in a folder
« Reply #20 on: December 14, 2011, 05:42 PM »
Your application recognizes all the PDFs.
I need to find out why this Windows Explorer option extracts PDFs from compressed files and counts them.

Best Regards.

If possible, I would like your application to have an option to show the number of PDF pages.
 :-*

Contro

Re: Controlling certain facts in a folder
« Reply #21 on: December 14, 2011, 05:42 PM »
Here's the version with total page count in the statusbar:

PDFInfoGUI download

Downloading
 :P

Contro

Re: Controlling certain facts in a folder
« Reply #22 on: December 14, 2011, 05:49 PM »
Working and enjoying

Best Regards

skwire

Re: Controlling certain facts in a folder
« Reply #23 on: December 14, 2011, 09:21 PM »
Right on.  Thanks for testing.  I'll submit this as a NANY entry as well.

bob99

  • Supporting Member
  • Joined in 2008
  • **
  • default avatar
  • Posts: 345
    • View Profile
    • Donate to Member
Re: Controlling certain facts in a folder
« Reply #24 on: December 20, 2011, 11:19 AM »
skwire,

I am really liking PDFInfoGUI.   :Thmbsup:
Something I just noticed.  I am not seeing anything in the Page size, File size or PDF Version columns. If I launch one of the PDFs and go into Properties, a file size is displayed.  But I don't see anything that has to do with page size or PDF Version (except Ghostscript). I have looked under the basic and advanced properties tabs.

I'm using PDFXchange as my viewer. Deleted Adobe quite a while back.

Of the 3, the one I would like to be able to see is the File size.  

Thanks again for a great utility/program.
« Last Edit: December 20, 2011, 11:20 AM by bob99, Reason: typo »