NANY 2013 (https://www.donationcoder.com/forum/index.php?board=304.0) Entry Information
Application Name | pdfautomv (or pdfautomv Robot, haven't decided yet ;)) |
Version | 0.3 |
Short Description | Moves PDF files into directories depending on embedded text |
Supported OSes | Windows (and possibly Linux) |
Web Page | https://bitbucket.org/phitsc/pdfautomv |
Download Link | Source is available via the link above, but I'll make a zip too. |
System Requirements | Ruby runtime |
Version History | 0.3 - Fixed a crash. 0.2 - Rule files now have to be UTF-8. Fixed crash with -v 3 option. |
Description
pdfautomv will be a simple command-line utility (although perfectly usable with a simple double-click on a desktop shortcut) for the paperless office aficionado. Its purpose is to move PDF files from one directory to another based on the text embedded in each PDF file. My own primary use case is as follows:
1. Put invoice, receipt, letter, bank statement, whatever on scanner
2. Start scanning process => this will produce a PDF file in directory A
3. Repeat 1 - 2 until everything is scanned and directory A is full of files like Document.pdf, Document001.pdf, Document002.pdf, etc.
4. Double-click shortcut to pdfautomv.rb => marvel at how all the Document bla bla.pdf files get nicely and neatly renamed and sorted into the directories where they belong
Usage
Installation
The application will be written in Ruby, so the Ruby runtime has to be installed if it is not already. The application itself is just one Ruby file.
Using the Application
The application will rely on some "rule" files which have to be supplied by the user. A rule file specifies what pdfautomv should look for in a PDF file and, when a file matches, where to move it and how to rename it.
Here's an example rule file:
[match]
079 123 45 67
[variables]
dateLong=(\d\d)\. (Januar|Februar|März|April|Mai|Juni|Juli|August|September|Oktober|November|Dezember) (\d\d\d\d)
dateShort:5=(\d\d)\.(\d\d)\.(\d\d)
[move]
\\server\pl-office\bills\<dateLong:3>\<dateLong:3>-<dateShort:2> Telco - mobile.PDF
Both the match and the variables are regular expressions. The variables can then be referenced in the move expression (in angle brackets). Inside the brackets, :n references the nth regex capture, so <dateLong:3> is the year. The matched text is available via the implicit <match> variable. In the [variables] section, the :5 after dateShort specifies that the 5th match of that expression should be assigned to the dateShort variable.
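To make the rule semantics a bit more concrete, here is a minimal Ruby sketch of how a rule like the one above could be evaluated against the text of a PDF. It is not taken from the pdfautomv source: the helper names (parse_rule, nth_match, resolve_variables, expand) are made up for illustration, the sample text is invented, the month list is shortened, and the target path is a plain relative path. The sketch assumes that a variable without :n takes the first match and that a reference without :n expands to the whole matched text.

```ruby
# Hypothetical helpers for illustration only; not pdfautomv's actual code.

# Split a rule file into its [match], [variables] and [move] sections.
def parse_rule(text)
  sections = Hash.new { |hash, key| hash[key] = [] }
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line =~ /\A\[(\w+)\]\z/
      current = $1
    elsif current
      sections[current] << line
    end
  end
  sections
end

# Return the MatchData of the nth (1-based) occurrence of pattern, or nil.
def nth_match(pattern, text, n)
  pos = 0
  found = nil
  n.times do
    found = pattern.match(text, pos)
    return nil unless found
    pos = found.end(0)
  end
  found
end

# Turn lines such as "dateShort:5=<regex>" into { "dateShort" => MatchData }.
# Assumption: without ":n" the first occurrence is used.
def resolve_variables(lines, text)
  lines.each_with_object({}) do |line, vars|
    m = line.match(/\A(\w+)(?::(\d+))?=(.*)\z/)
    next unless m
    name, occurrence, source = m.captures
    vars[name] = nth_match(Regexp.new(source), text, (occurrence || 1).to_i)
  end
end

# Replace <name> and <name:n> in the move expression.
# Assumption: without ":n" the whole matched text is inserted (group 0).
def expand(template, vars)
  template.gsub(/<(\w+)(?::(\d+))?>/) do
    name  = Regexp.last_match(1)
    group = Regexp.last_match(2).to_i # nil.to_i is 0, i.e. the whole match
    found = vars[name]
    found ? found[group].to_s : ""
  end
end

rule = parse_rule(<<~'RULE')
  [match]
  079 123 45 67
  [variables]
  dateLong=(\d\d)\. (Januar|Februar|Dezember) (\d\d\d\d)
  dateShort:2=(\d\d)\.(\d\d)\.(\d\d)
  [move]
  bills/<dateLong:3>/<dateLong:3>-<dateShort:2> Telco - mobile.pdf
RULE

# Invented stand-in for the text extracted from a scanned invoice.
text = "Rechnung vom 14. Januar 2013\nTel. 079 123 45 67\nPeriode 01.12.12 - 31.12.12\n"

match = Regexp.new(rule["match"].first).match(text)
target = nil
if match
  vars = resolve_variables(rule["variables"], text)
  vars["match"] = match # the implicit <match> variable
  target = expand(rule["move"].first, vars)
end
puts target # prints "bills/2013/2013-12 Telco - mobile.pdf"
```

Here dateShort:2 picks the second short date in the text (31.12.12), and <dateShort:2> then takes its second capture, the month. Actually moving the file and extracting the text from the PDF are left out on purpose.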