

Text Parsing and Output ( Result in Excel, load multiple DOC,RTF,Word Files )


Ath:
I've finished a rough first version. It only handles .rtf files, and is command-line controlled for now, but it works :)

The prerequisite is to have Sun/Oracle Java 6 installed; grab JRE 6 release 31 here. (It has not been tested with Java 7 yet, but it should work fine.)

Then get the download from my (DC supplied) webspace, unzip it into some directory, open a command prompt there, and use a command line like:


--- Code: Text ---
ScriptLineCounter -p scriptfilesdir -x scripttitle.xlsx

It expects a subdirectory with the name given at -p (scriptfilesdir in the example; use double quotes around the name if it contains spaces), processes all files with the .rtf extension in it, and creates an Excel file (.xls or .xlsx, as specified) in the current directory with the name supplied at the -x parameter.
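
As a rough illustration of the behaviour described above (this is not the tool's actual code, which is Java and is not shown in this thread), the sketch below picks the .rtf files out of a directory listing and derives the Excel format from the name given at -x. The function names selectRtfFiles and excelFormat are made up for this example, and both the case-insensitive matching and the error on an unknown extension are assumptions.

```javascript
// Hypothetical helpers; ScriptLineCounter's real implementation is not shown in this thread.

// Keep only the names that end in .rtf (case-insensitive matching is an assumption).
function selectRtfFiles(fileNames) {
    return fileNames.filter(function (name) {
        return /\.rtf$/i.test(name);
    });
}

// Derive the Excel format from the extension of the name passed to -x.
// Throwing on any other extension is an assumption made for this sketch.
function excelFormat(outputName) {
    var match = /\.(xlsx?)$/i.exec(outputName);
    if (!match) {
        throw new Error('Expected a .xls or .xlsx file name, got: ' + outputName);
    }
    return match[1].toLowerCase();
}

var files = selectRtfFiles(['ep01.rtf', 'notes.txt', 'EP02.RTF', 'ep03.doc']);
console.log(files);                           // only the two .rtf names remain
console.log(excelFormat('scripttitle.xlsx')); // xlsx
```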

I'll be working on .doc and .docx support, and a GUI for easier 'control'. And a readme also has to be supplied :tellme:

Questions/support go in this thread for now; I'll officially release it in a separate thread once it's a bit more mature :-[

Krishean:
Here are the missing functions. csv_escape_field was written in PHP; I translated it to JavaScript. I'm not sure when I'll have some free time to work on this again.


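--- Code: Javascript ---
// Quote a CSV field if it contains a comma or a double quote; embedded quotes are doubled.
function csv_escape_field(str){
        return((str.indexOf(',')!=-1||str.indexOf('"')!=-1)?'"'+str.replace(/"/g,'""')+'"':str);
}

// print_r-style dump (named after PHP's print_r): formats a nested object/array as indented text.
// Prints via echo() unless return_val is true, in which case the text is returned instead.
// Note: echo() is not defined in this post; it is presumably supplied by the rest of the script.
function print_r(array,return_val){
        var output='',pad_char=' ',pad_val=4,
        // Build a string of len copies of pad_char.
        repeat_char=function(len,pad_char){
                var str='';
                for(var i=0;i<len;i++)str+=pad_char;
                return str;
        },
        // Recursively format obj, indenting by pad_val characters per nesting level.
        formatArray=function(obj,cur_depth,pad_val,pad_char){
                var base_pad=repeat_char(pad_val*cur_depth,pad_char);
                var thick_pad=repeat_char(pad_val*(cur_depth+1),pad_char);
                var str='';
                try{
                        if(typeof(obj) == 'object' && !(obj === null || obj === undefined)){
                                // Extract the constructor name, e.g. "Array" or "Object".
                                var typ=new String(obj.constructor);
                                str += typ.substring(typ.indexOf(' ')+1,typ.indexOf('('))+'\n' + base_pad + '(\n';
                                for (var key in obj) {
                                        if(typeof(obj[key]) == 'object'){
                                                str += thick_pad + '[' + key + '] => ' + formatArray(obj[key], cur_depth + 1, pad_val, pad_char) + '\n';
                                        }else{
                                                str += thick_pad + '[' + key + '] => "' + obj[key] + '"\n';
                                        }
                                }
                                str+=base_pad+')';
                        }else str=obj.toString();
                }catch(err){} // null/undefined values end up here and are rendered as ''
                return str;
        };
        output=formatArray(array,0,pad_val,pad_char);
        if(return_val!==true){echo(output);return true;}
        return output;
}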
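
A short usage sketch for csv_escape_field, in case it helps: the helper csv_row below is hypothetical (it is not part of the posted script) and simply joins escaped fields into one CSV line. The function is repeated here only so the example runs on its own.

```javascript
// Copy of csv_escape_field from the post above, so the example is self-contained.
function csv_escape_field(str) {
    return (str.indexOf(',') != -1 || str.indexOf('"') != -1)
        ? '"' + str.replace(/"/g, '""') + '"'
        : str;
}

// Hypothetical helper: convert each field to a string, escape it, and join with commas.
function csv_row(fields) {
    return fields.map(function (field) {
        return csv_escape_field(String(field));
    }).join(',');
}

console.log(csv_row(['HAMLET', 'To be, or not to be', 'He said "no"', 42]));
// → HAMLET,"To be, or not to be","He said ""no""",42
```

Note that the function as posted does not quote fields that contain line breaks, which RFC 4180 also asks for; if the parsed text can span lines, that case would need an extra check.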

Ath:
I updated ScriptLineCounter to version 1.1.0.0

What's new/changed:

* Process .doc/.docx MS-Word files
* Process .txt files as a bonus
* Included a ScriptLineCounter-sample.ini file; copy it to ScriptLineCounter.ini and enable the desired settings/titles
* Titles in the output .xls/.xlsx file can be set in the ScriptLineCounter.ini
* Improved error handling if one of the required lib/*.jar files is missing
* Corrected the scriptlinecounter.sh file; it still had Excel2Html stuff in it :huh:

TODO: (in random order)

* Write a Readme.txt file
* Create a GUI interface
* Fix any bugs or issues reported (0 until now :))

Download:

* From the ScriptLineCounter webpage

Ath:
Updated ScriptLineCounter to version 1.1.0.1

What's new/changed:

* Fixed: Processing of .doc MS-Word files (I had left some debugging stuff behind :-[)
* Added: [IgnoreCharacters] section in the ScriptLineCounter.ini to ignore specific character names

TODO: (in random order)

* Write a Readme.txt file
* Create a GUI interface
* Fix any bugs or issues reported (1 until now :o)

Download:

* From the ScriptLineCounter webpage
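
To give an idea of how an [IgnoreCharacters] section could be applied, here is a minimal sketch. The layout of ScriptLineCounter.ini is not shown in this thread, so the format used here (one character name per line, ; or # for comments) is purely an assumption, as are the function name readIniSection, the [Titles] section and all sample names and counts.

```javascript
// Assumed layout: one character name per line under [IgnoreCharacters].
// The real ScriptLineCounter.ini format may differ.
function readIniSection(iniText, sectionName) {
    var entries = [];
    var inSection = false;
    iniText.split(/\r?\n/).forEach(function (rawLine) {
        var line = rawLine.trim();
        if (line === '' || line.charAt(0) === ';' || line.charAt(0) === '#') {
            return; // skip blank lines and comments
        }
        var header = /^\[(.+)\]$/.exec(line);
        if (header) {
            inSection = (header[1] === sectionName);
            return;
        }
        if (inSection) {
            entries.push(line);
        }
    });
    return entries;
}

// Made-up sample file content and line counts.
var sampleIni = [
    '[Titles]',
    'Title=My Show',
    '',
    '[IgnoreCharacters]',
    '; things that look like character names but are not',
    'NARRATOR',
    'ALL'
].join('\n');

var ignored = readIniSection(sampleIni, 'IgnoreCharacters');
var counts = { HAMLET: 12, NARRATOR: 3, OPHELIA: 7, ALL: 1 };

// Drop every character whose name appears in the ignore list.
var kept = Object.keys(counts).filter(function (name) {
    return ignored.indexOf(name) === -1;
});
console.log(kept); // the names that are not ignored
```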

Ath:
Updated ScriptLineCounter to version 1.2.0.0

What's new/changed:

* Improved: Spoken-lines detection
* Improved: Detection of Character names containing accented (Unicode) characters
* Improved: Episode-number detection
* Improved: Count the line for every character if multiple characters speak it
* Added: Sort by total lines-count (-s) instead of in order of appearance (default)
* Added: Filter the detected lines for a selection of characters and print them to the console (-m)
* Added: [CharacterNames] section in ScriptLineCounter.ini, for mapping Characters to Actor names

TODO: (in random order)

* Write a Readme.txt file
* Create a GUI interface
* Add some unexpected features :)
* Fix any bugs or issues reported (1 until now :o)

Download:

* From the ScriptLineCounter webpage
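
The counting and sorting behaviour listed above can be sketched roughly as follows. This is not the tool's Java code; the input shape (one array of speaker names per spoken line) and the function names countLines and sortByTotal are assumptions made for the example. A shared line is counted once for every speaker, and the sort mimics the idea behind -s while keeping order of appearance for ties.

```javascript
// Each entry lists the speaker(s) of one spoken line; this input shape is an assumption.
function countLines(spokenLines) {
    var counts = {};
    var order = []; // remembers the order of first appearance
    spokenLines.forEach(function (speakers) {
        speakers.forEach(function (name) {
            if (!Object.prototype.hasOwnProperty.call(counts, name)) {
                counts[name] = 0;
                order.push(name);
            }
            counts[name] += 1; // a shared line counts for every speaker
        });
    });
    // Default result: rows in order of appearance.
    return order.map(function (name) {
        return { name: name, lines: counts[name] };
    });
}

// Highest total first; ties fall back to the original (appearance) order.
function sortByTotal(rows) {
    return rows
        .map(function (row, index) { return { row: row, index: index }; })
        .sort(function (a, b) {
            return (b.row.lines - a.row.lines) || (a.index - b.index);
        })
        .map(function (entry) { return entry.row; });
}

var rows = countLines([['ANNA'], ['BOB'], ['ANNA', 'BOB'], ['BOB'], ['CARL']]);
console.log(rows);              // ANNA 2, BOB 3, CARL 1 (order of appearance)
console.log(sortByTotal(rows)); // BOB 3, ANNA 2, CARL 1 (sorted by total)
```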
