Topic: Text Parsing and Output (Result in Excel, load multiple DOC, RTF, Word Files)
Ath
Supporting Member
**
Posts: 2,132



« Reply #25 on: April 01, 2012, 03:37:17 PM »

Updated ScriptLineCounter to version 1.4.0.0

What's new/changed:
  • Added: PDF read capability
  • Added: Parsing FileFormat 3, as supplied in pdf, nearly correct, to be validated
  • Added: -oc (OutputContent) option, no ini setting, displays all read file-content to console when -v also is set
  • Added: -im (IgnoreMinimalScore) option, no ini setting
  • Added: -xe option, Extra Info sheet to excel file, listing all files, the recognition percentage and the file-format detected
  • Improved: Overhauled GUI options and layout, added url-label with link to DC-forum thread

TODO: (in priority-order)
  • Improve and expand the GUI interface (partially done)
  • Possibly add headers and footers to the output
  • Add some unexpected features :)
  • Fix any bugs or issues reported (2 until now :o)
  • Handle pdf files for input
  • Handle extra file-format extracted from samples received
  • Write a Readme.txt file
  • Create a GUI interface

Download:

Ath
« Reply #26 on: April 07, 2012, 08:54:59 AM »

Updated ScriptLineCounter to version 1.4.1.0

What's new/changed:
  • Fixed: Exception when generating output but no episodes were found (no files?)
  • Improved: If a .doc file is actually a disguised .rtf file, then read it like rtf, and same for .docx
  • Improved: Remove non-breaking spaces from character names, as found in some .doc files
  • Added: ScriptLineCounter.exe built using Launch4j, to avoid having a Command prompt open during run, also disables Console output
  • Added: Messagebox feature when running from .exe, messages to console shown as messagebox when needed
  • Added: ScriptLineCounter-CharacterMapping.properties settings file to merge multiple characters into one, for resolving some typos, supports Unicode
  • Added: ScriptLineCounter-CharacterNames.properties settings file to replace [CharacterNames] section in ini file, supports Unicode
  • Added: ScriptLineCounter-IgnoreCharacters.properties settings file to replace [IgnoreCharacters] section in ini file, supports Unicode
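
For readers wondering how a "disguised" .rtf file can be recognized: one common approach is to ignore the extension and look at the first bytes of the file, since every RTF document starts with the characters `{\rtf`. The following is only a minimal sketch of that idea; the class and method names are hypothetical and this is not taken from the SLC source.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper (not SLC code): detects RTF content regardless of the file extension. */
final class RtfSniffer {
    // Every RTF document starts with the ASCII characters "{\rtf".
    private static final byte[] RTF_MAGIC = "{\\rtf".getBytes(StandardCharsets.US_ASCII);

    private RtfSniffer() {}

    /** Returns true if the given leading bytes start with the RTF signature. */
    static boolean looksLikeRtf(byte[] head) {
        if (head.length < RTF_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(head, RTF_MAGIC.length), RTF_MAGIC);
    }

    /** Reads only the first few bytes of the file and checks them. */
    static boolean looksLikeRtf(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[RTF_MAGIC.length];
            int filled = 0;
            while (filled < head.length) {
                int read = in.read(head, filled, head.length - filled);
                if (read < 0) {
                    break; // file is shorter than the signature
                }
                filled += read;
            }
            return filled == head.length && looksLikeRtf(head);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller could run `RtfSniffer.looksLikeRtf(path)` before choosing a reader and fall back to the RTF reader whenever it returns true, whatever the extension says.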

TODO: (in priority-order)
  • Improve and expand the GUI interface (partially done)
  • Possibly add headers and footers to the output
  • Add some unexpected features :)
  • Fix any bugs or issues reported (3 until now :o)
  • Handle pdf files for input
  • Handle extra file-format extracted from samples received
  • Write a Readme.txt file
  • Create a GUI interface

Download:

Screenshot:
A new screenshot would be appropriate, as the GUI has changed quite a bit since the previous one was taken.



The Console section is marked to emphasize that it's disabled when running the .exe, as that has no Console capabilities.

Ath
« Reply #27 on: April 09, 2012, 05:11:19 AM »

Updated ScriptLineCounter to version 1.4.2.0

What's new/changed:
  • Added: OpenOffice/LibreOffice .odt and .ods read capability (minimal support)
  • Added: More (optional) sheets if -xe specified: Name mappings and Ignored names
  • Improved: Extra info sheet now shows percentages with 1 decimal place, and has extra columns for text-lines found and lines recognized
  • Improved: Refactorings in code
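
As an illustration of the one-decimal percentages: assuming the recognition percentage is simply lines recognized divided by text-lines found (an assumption, the posts do not spell out the formula), it could be computed and formatted like this. The class and method names are made up for the example.

```java
import java.util.Locale;

/** Hypothetical helper (not SLC code): recognition percentage with one decimal place. */
final class RecognitionStats {
    private RecognitionStats() {}

    /** Assumed formula: recognized lines as a percentage of the text-lines found. */
    static double percentage(int recognized, int found) {
        if (found <= 0) {
            return 0.0; // avoid dividing by zero for empty files
        }
        return 100.0 * recognized / found;
    }

    /** Formats with exactly one decimal; Locale.ROOT keeps the decimal point a '.'. */
    static String format(double percentage) {
        return String.format(Locale.ROOT, "%.1f%%", percentage);
    }
}
```

With 37 of 42 lines recognized, `RecognitionStats.format(RecognitionStats.percentage(37, 42))` gives `88.1%`.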

TODO: (in priority-order)
  • Replace own logging system by log4j (already required/used by some libs)
  • Improve and expand the GUI interface (partially done)
  • Possibly add headers and footers to the output
  • Add some unexpected features :)
  • Fix any bugs or issues reported (3 until now :o)
  • Handle pdf files for input
  • Handle extra file-format extracted from samples received
  • Write a Readme.txt file
  • Create a GUI interface

Download:

Ath
« Reply #28 on: April 30, 2012, 09:27:47 AM »

Updated ScriptLineCounter to version 1.5.0.0

What's new/changed:
  • Changed: Replaced own simple logging system with calls to Log4J (infrastructure already required for other libs)

TODO: (in priority-order)
  • Improve and expand the GUI interface (partially done)
  • Possibly add headers and footers to the output
  • Add some unexpected features :)
  • Fix any bugs or issues reported (3 until now :o)
  • Replace own logging system by log4j (already required/used by some libs)
  • Handle pdf files for input
  • Handle extra file-format extracted from samples received
  • Write a Readme.txt file
  • Create a GUI interface

Download:

Ath
« Reply #29 on: May 07, 2012, 04:00:29 PM »

Updated ScriptLineCounter to version 1.5.1.0

What's new/changed:
  • Changed: Updated the POI library from 3.8-beta6 to the 3.8 release (you'll need the Initial/Full install)

TODO: (in priority-order)
  • Improve and expand the GUI interface (partially done)
  • Possibly add headers and footers to the output
  • Add some unexpected features :)
  • Fix any bugs or issues reported (3 until now :o)
  • Replace own logging system by log4j (already required/used by some libs)
  • Handle pdf files for input
  • Handle extra file-format extracted from samples received
  • Write a Readme.txt file
  • Create a GUI interface

Download:

Ath
« Reply #30 on: November 18, 2012, 08:51:36 AM »

Updated ScriptLineCounter to version 1.6.0.0

What's new/changed:
  • Added: New FileFormat no. 4, based on doc files supplied by Saira, the original initiator of this tool.
      Note: This format is quite similar to FileFormat 2, so it may need to be forced on some files/filesets.
  • Added: Setting the FileFormat from the command line using the -ff parameter, or from the ini file, as documented in the readme.
  • Changed: GUI now has the FileFormat combo box enabled, to force a specific file format to be used.
  • Improved: Some more robustness while handling the file contents.
  • Added: A warning in the readme file to NOT use Windows Notepad for editing the properties files, as it inserts a BOM in UTF-8 files, not supported by SLC.
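
To show why the BOM matters: when a UTF-8 file saved by Notepad is read as text, the byte order mark arrives as the character U+FEFF in front of the first key, so that key no longer matches. SLC does not handle this, hence the warning. Purely as a sketch of how a Java program could tolerate it, using `java.util.Properties.load(Reader)` and a hypothetical helper class:

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper (not SLC code): loads properties, tolerating a leading BOM. */
final class BomSafeProperties {
    private BomSafeProperties() {}

    /** Loads properties from a character stream, skipping a leading U+FEFF if present. */
    static Properties load(Reader source) {
        try {
            PushbackReader in = new PushbackReader(source, 1);
            int first = in.read();
            if (first != -1 && first != '\uFEFF') {
                in.unread(first); // not a BOM: put the character back
            }
            Properties props = new Properties();
            props.load(in);
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The reader would typically come from `Files.newBufferedReader(path, StandardCharsets.UTF_8)`, so that Unicode names in the file are decoded correctly.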

TODO: (in priority-order)
  • Improve and expand the GUI interface (partially done)
  • Add some unexpected features :)
  • Fix any bugs or issues reported (3 until now :o)
  • Possibly add headers and footers to the output
  • Replace own logging system by log4j (already required/used by some libs)
  • Handle pdf files for input
  • Handle extra file-format extracted from samples received (4 file-formats supported now)
  • Write a Readme.txt file
  • Create a GUI interface

Download:

Ath
« Reply #31 on: March 09, 2013, 06:49:23 AM »

Updated ScriptLineCounter to version 1.7.0.0

What's new/changed:
  • Added: New FileFormat no. 5, for reading data from CSV files (comma-separated or tab-separated) that have a first row with column names, a "Name" column and an optional "Text" column. Column names can be configured from the command line and the .ini file; details are in the readme file.
  • Changed: The -pe encoding parameter is now (also) applied for reading .txt and .csv files, if it has been set/changed from the command-line or GUI.
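
A rough sketch of what reading such a file involves, assuming the goal is a count of rows per name. The class name, the delimiter guess (tab if the header contains one, comma otherwise) and the case-insensitive header match are all assumptions for illustration, and quoted fields are not handled; this is not the SLC implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not SLC code): counts rows per character name in CSV-like rows. */
final class CsvLineCounter {
    private CsvLineCounter() {}

    /** The first row must hold the column names; nameColumn is the header to look for. */
    static Map<String, Integer> countLines(List<String> rows, String nameColumn) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        if (rows.isEmpty()) {
            return counts;
        }
        String header = rows.get(0);
        // Assumption: tab-separated if the header contains a tab, comma-separated otherwise.
        String delimiter = header.contains("\t") ? "\t" : ",";
        String[] columns = header.split(delimiter, -1);

        int nameIndex = -1;
        for (int i = 0; i < columns.length; i++) {
            if (columns[i].trim().equalsIgnoreCase(nameColumn)) {
                nameIndex = i;
                break;
            }
        }
        if (nameIndex < 0) {
            throw new IllegalArgumentException("No column named " + nameColumn);
        }

        for (String row : rows.subList(1, rows.size())) {
            String[] fields = row.split(delimiter, -1);
            if (fields.length <= nameIndex) {
                continue; // short or empty row
            }
            String name = fields[nameIndex].trim();
            if (!name.isEmpty()) {
                counts.merge(name, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

The rows could be read with `Files.readAllLines(path, charset)`, where the charset is whatever encoding was selected for the input.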

TODO: (in priority-order)
  • Fix any bugs or issues reported (3 until now :o)
  • Improve and expand the GUI interface (partially done)
  • Add some unexpected features :)
  • Handle extra file-format extracted from samples received (5 file-formats supported now)
  • Possibly add headers and footers to the output
  • Replace own logging system by log4j (already required/used by some libs)
  • Handle pdf files for input
  • Write a Readme.txt file
  • Create a GUI interface

Download:

Ath
« Reply #32 on: June 15, 2013, 05:42:38 AM »

Updated ScriptLineCounter to version 1.7.1.0

What's new/changed:
  • Added: .doc/.docx files can now be processed like CSV (file-format 5) if the content is a table with proper headings (use the -cn and -ct command-line options to set the headers to use)
  • Added: Command-line (-ci <lines>), ini (skiplines) and GUI option to ignore the first n lines of a file
  • Added: Command-line (-cm <max>), ini (maxcharacternameparts) and GUI option to specify the maximum number of words a character name can consist of. Default is 4
  • Changed: When using debug level 3 or 4 (-3 / -4 command-line options), each file read is saved as text under the same name with a .txt.tmp extension appended, useful for inspecting how SLC 'sees' the file
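
To make the two new limits concrete, here is a small sketch of what skipping the first n lines and capping the number of words in a character name could look like. How SLC applies these settings internally is not described here, so the class, the method names and the whitespace-based word split are assumptions; only the default of 4 comes from the notes above.

```java
import java.util.List;

/** Hypothetical helper (not SLC code): illustrates the skip-lines and name-parts limits. */
final class NameFilter {
    /** Default maximum number of words in a character name, as stated above. */
    static final int DEFAULT_MAX_NAME_PARTS = 4;

    private NameFilter() {}

    /** Returns a view of the lines without the first 'skip' entries. */
    static List<String> skipLines(List<String> lines, int skip) {
        int from = Math.min(Math.max(skip, 0), lines.size());
        return lines.subList(from, lines.size());
    }

    /** Accepts a candidate name only if it has between 1 and maxParts words. */
    static boolean isPlausibleName(String candidate, int maxParts) {
        String trimmed = candidate.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        // Assumption: words are separated by runs of whitespace.
        return trimmed.split("\\s+").length <= maxParts;
    }
}
```

With the default limit, "DR JOHN H WATSON" (four words) would still count as a name, while "THE MAN IN THE HAT" (five words) would not.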

TODO: (in priority-order)
  • Fix any bugs or issues reported (3 until now :o)
  • Improve and expand the GUI interface (partially done)
  • Add some unexpected features :)
  • Handle extra file-format extracted from samples received (5 file-formats supported now)
  • Possibly add headers and footers to the output
  • Replace own logging system by log4j (already required/used by some libs)
  • Handle pdf files for input
  • Write a Readme.txt file
  • Create a GUI interface

Updated GUI screenshot:


Download:

Ath
« Reply #33 on: June 20, 2013, 08:33:19 AM »

Updated ScriptLineCounter to version 1.7.2.0

What's new/changed:
  • Added: FileFormat 5 parameters -cn and -ct now also accept a column number instead of a name. The first column is 1.
  • Improved: GUI now has fields for the Name and Text columns for FileFormat 5. Column numbers can be entered there as well.
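
For illustration, accepting either a column name or a 1-based column number could be handled as below. The helper is hypothetical, and the choice to treat an all-digit value as a number before trying it as a name is an assumption, not documented SLC behaviour.

```java
/** Hypothetical helper (not SLC code): resolves a -cn/-ct style value to a column index. */
final class ColumnSpec {
    private ColumnSpec() {}

    /**
     * Returns the zero-based index of the column described by spec, or -1 if there is none.
     * spec is either a 1-based column number or a column name.
     */
    static int resolve(String spec, String[] headers) {
        String s = spec.trim();
        // Assumption: an all-digit value is a column number, even if a header has that text.
        if (s.matches("\\d{1,9}")) {
            int oneBased = Integer.parseInt(s);
            return (oneBased >= 1 && oneBased <= headers.length) ? oneBased - 1 : -1;
        }
        for (int i = 0; i < headers.length; i++) {
            if (headers[i].trim().equalsIgnoreCase(s)) {
                return i;
            }
        }
        return -1;
    }
}
```

With headers "Name" and "Text", both `resolve("2", headers)` and `resolve("Text", headers)` point at the second column (index 1), while `resolve("0", headers)` yields -1 because numbering starts at 1.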

TODO: (in priority-order)
  • Fix any bugs or issues reported (4 until now :o)
  • Improve and expand the GUI interface (partially done)
  • Add some unexpected features :)
  • Handle extra file-format extracted from samples received (5 file-formats supported now)
  • Possibly add headers and footers to the output
  • Replace own logging system by log4j (already required/used by some libs)
  • Handle pdf files for input
  • Write a Readme.txt file
  • Create a GUI interface

Updated GUI screenshot:


Download:
« Last Edit: June 20, 2013, 08:39:25 AM by Ath »

Ath
« Reply #34 on: June 22, 2013, 09:22:34 AM »

Updated ScriptLineCounter to version 1.7.2.1

What's new/changed:
  • Fixed: Input of Name and Text column name was not transferred correctly to the running instance from the GUI.
  • Improved: GUI layout was a bit stretched.

TODO: (in priority-order)
  • Fix any bugs or issues reported (5 until now :o)
  • Improve and expand the GUI interface (partially done)
  • Add some unexpected features :)
  • Handle extra file-format extracted from samples received (5 file-formats supported now)
  • Possibly add headers and footers to the output
  • Replace own logging system by log4j (already required/used by some libs)
  • Handle pdf files for input
  • Write a Readme.txt file
  • Create a GUI interface

Download: