Author Topic: Beta new version of TrID file identifier - now with batch scanning & renaming  (Read 20103 times)

Mark0

  • Charter Honorary Member
  • Joined in 2005
  • ***
  • Posts: 652
    • View Profile
    • Mark's home
    • Donate to Member
I just finished a new beta (though it seems pretty stable) of my freeware TrID file identifier, for anyone who wants to try it.
It bases its analysis on a library of definitions that lists, for each filetype, a series of patterns and unique tokens, so that it can guess which filetype a given file most resembles, regardless of the file's name and extension.
Here's a typical result:

C:\TrID>trid \windows\media\chimes.wav

TrID/32 - File Identifier v1.72b - (C) 2003-06 By M.Pontello
Definitions found:  1956
Analyzing...

Collecting data from file: \windows\media\chimes.wav
 50.0% (.WAV) RIFF/WAVe standard Audio (4008/2)
 49.9% (.) Generic RIFF container (4000/1)
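
The percentages above look like each match's share of the total points, assuming the first number in parentheses is the match score (the post doesn't document that format, so this is a guess). A minimal Python sketch under that assumption; `share_of_points` is a hypothetical helper, not part of TrID:

```python
def share_of_points(points):
    """Convert raw match points into percentages of the total."""
    total = sum(points.values())
    if total == 0:
        return {name: 0.0 for name in points}
    return {name: 100.0 * value / total for name, value in points.items()}


# Points copied from the chimes.wav output above (assumed to be scores).
shares = share_of_points({
    "RIFF/WAVe standard Audio": 4008,
    "Generic RIFF container": 4000,
})
for name, pct in shares.items():
    print(f"{pct:5.2f}% {name}")
```

With those two inputs the shares come out at about 50.05% and 49.95%, which is consistent with the 50.0% and 49.9% printed by TrID if it truncates rather than rounds.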

The current stable version of TrID uses a library of XML files for the defs, so it takes some time to load and parse them at startup. For this new version I used a binary container instead, so loading is now almost instantaneous. It's also now possible to scan an entire folder of files:

C:\TrID>trid \pbcc\bin\*.exe

TrID/32 - File Identifier v1.72b - (C) 2003-06 By M.Pontello
Definitions found:  1956
Analyzing...

File: \pbcc\bin\CCEdit.exe
 33.6% (.EXE) Win32 Executable PowerBASIC/Win 7.x (235131/25/18)

File: \pbcc\bin\PBCC.exe
 84.9% (.EXE) Win16 NE executable (generic) (34068/22/9)

File: \pbcc\bin\PBRes.exe
 61.2% (.EXE) WIN32 Executable PowerBASIC/CC 3.02 (393928/51/44)

File: \pbcc\bin\PBrow.exe
 38.0% (.EXE) Win32 Executable PowerBASIC/Win 7.x (235131/25/18)

File: \pbcc\bin\RC.exe
 72.1% (.EXE) Win32 Executable MS Visual C++ (generic) (37706/45/16)

If needed, TrID can also rename the scanned files, adding the guessed filetype extensions.
This comes in handy, for example, when you have a bunch of files recovered with CHKDSK and the type of each file isn't immediately clear.
So if you have a folder with files like:

FILE0001.CHK
FILE0002.CHK
FILE0003.CHK
...

Running something like:
C:\TrID>trid \myfolder\* -ae
will rename them to:

FILE0001.CHK.doc
FILE0002.CHK.xls
FILE0003.CHK.gif
...
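
For anyone curious about what the append-extension step amounts to, here is a rough Python sketch. It is not TrID's code: the tiny `MAGIC` table and both function names are made up for illustration, and it only looks at a few well-known leading signatures, whereas TrID works from its full definitions library.

```python
import os

# Tiny illustrative signature table; TrID's real definitions are far richer.
MAGIC = [
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"%PDF-", "pdf"),
]


def guess_extension(path):
    """Return an extension guessed from the file's leading bytes, or None."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    for signature, extension in MAGIC:
        if head.startswith(signature):
            return extension
    return None


def add_guessed_extensions(folder):
    """Append the guessed extension to each recognized file; return new names."""
    renamed = []
    for name in sorted(os.listdir(folder)):
        old_path = os.path.join(folder, name)
        if not os.path.isfile(old_path):
            continue
        extension = guess_extension(old_path)
        if extension is None:
            continue  # leave unrecognized files untouched
        new_path = f"{old_path}.{extension}"
        if os.path.exists(new_path):
            continue  # never overwrite an existing file
        os.rename(old_path, new_path)
        renamed.append(os.path.basename(new_path))
    return renamed
```

Like the `-ae` example above, the sketch keeps the original name and only appends to it, so nothing is lost if the guess turns out to be wrong.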

Here's the download link: TrID 1.72b (290KB)
It includes TrID's executable and a package with defs for over 1,900 filetypes.
Just unpack it into a folder and run.

I plan to publish a stable version on TrID's page in a couple of days, along with updated versions of the companion tools (like the one that takes the XML defs and creates the single package, etc.).
A Linux port is also almost ready, thanks to the migration of the code base from PowerBASIC to the free, open-source FreeBASIC.

Hope it will be useful to someone.

P.S.
It is and will remain free for personal / non-profit use, of course.

Bye!

rjbull

  • Charter Member
  • Joined in 2005
  • ***
  • default avatar
  • Posts: 3,199
    • View Profile
    • Donate to Member
Mark0,

you might want to take a quick look at Eric Phelps' UnCHK, which has a brief discussion of CHK files.  TrID will be a lot more powerful, though; it has so many more file types as standard.

[edit: link fixed, sorry]

« Last Edit: June 01, 2006, 05:57 AM by rjbull »

f0dder

  • Charter Honorary Member
  • Joined in 2005
  • ***
  • Posts: 9,153
  • [Well, THAT escalated quickly!]
    • View Profile
    • f0dder's place
    • Read more about this member.
    • Donate to Member
I think you need to fix the link, rjbull :)
- carpe noctem

Mark0

Thanks rjbull, I had already seen UnCHK (maybe while searching for "file identifier", or I read about it somewhere).

Bye!

Mark0

I uploaded a new beta, with some minor additions / fixes.
It now checks for the defs package in the current dir first, and then, if it isn't there, in the folder of TrID's exe.
Alternatively, it's possible to use a specific one through a switch.

C:\TrID>trid -?

TrID/32 - File Identifier v1.74b - (C) 2003-06 By M.Pontello

Usage: TrID [path]<filespec(s)...> [-r:nn] [-v] [-p] [-w]
                                   [-d:file] [-?]

Where: <filespec> Files to identify/analyze
       -ae        Add guessed extension to filename
       -ns        Disable unique strings check
       -r:nn      Display the first nn matches (default: 5)
       -v         Verbose mode - display def name, author, etc.
       -d:file    Use the specified defs package
       -w         Wait for a key before exiting
       -?         This help!
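
The lookup order described in this post (an explicit `-d:file` package, otherwise the current dir, otherwise the exe's folder) can be sketched in Python as below. This is only an illustration: `find_defs_package` and its parameters are hypothetical names, the package file name is left to the caller because the post doesn't give it, and I'm assuming an explicit package takes priority over both folders.

```python
import os


def find_defs_package(package_name, exe_dir, explicit_path=None, cwd=None):
    """Return the defs package path to use, or None if none is found."""
    if explicit_path is not None:
        # Assumption: a package given on the command line wins outright.
        return explicit_path if os.path.isfile(explicit_path) else None
    cwd = os.getcwd() if cwd is None else cwd
    # Current directory first, then the folder holding the executable.
    for folder in (cwd, exe_dir):
        candidate = os.path.join(folder, package_name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Checking the current dir first makes it easy to try out a different defs package without touching the one installed next to the exe.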


Bye!

Mark0

The beta is finished and has become the new v2.00.



Link: TrID file identifier

Bye!

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,896
    • View Profile
    • Mouser's Software Zone on DonationCoder.com
    • Read more about this member.
    • Donate to Member
looks great mark0

Mark0

Thanks mouser!

Here's another piece of news, just added:



:D

Bye!

f0dder

Cute - a Linux version wouldn't have happened with PowerBASIC :)

Mark0

Exactly!  :Thmbsup:

Bye!

Cavalcader

  • Charter Member
  • Joined in 2006
  • ***
  • Posts: 194
  • Live Long & Prosper
    • View Profile
    • Donate to Member
TrID is a cool program -- I've used the v1 series on occasion over the last few years. I'm looking forward to trying the new one!
My Linguistic Profile:
  40% General American English
  30% Yankee
  20% Dixie

What Kind of American English Do You Speak?

Mark0

I updated TrID a couple of days ago. v2.10 mainly fixes the problem with files larger than 2GB (a classic!) and adds an option switch to change file extensions (in addition to the existing one that adds the guessed extension).
As before, it's available as both a Win32 and a Linux executable.

Mark0.net - Soft - TrID

P.S. TrID's definition library now covers over 4,000 filetypes.
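
The post doesn't say what actually caused the > 2GB problem. One common cause of this kind of bug, shown here purely as an assumption, is keeping a file size or offset in a signed 32-bit integer, which tops out just under 2 GiB. A small Python sketch of the wraparound (`as_signed_32` is a made-up helper for the demonstration):

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2,147,483,647 bytes, just under 2 GiB


def as_signed_32(value):
    """Show what a value becomes when stored in a signed 32-bit integer."""
    return ctypes.c_int32(value).value


size = 3 * 1024**3  # a 3 GiB file
print(size > INT32_MAX)    # True: the size does not fit
print(as_signed_32(size))  # the wrapped value is negative
```

A size that wraps to a negative number tends to break reads and seeks, which is why 64-bit sizes and offsets are the usual fix.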

Cavalcader

Thanks for the news! Is the GUI version 1.80 still good?

Mark0

Sure. The engine is still the same; just keep it fed with the newest definitions, and it's OK.

worstje

  • Honorary Member
  • Joined in 2009
  • **
  • Posts: 588
  • The Gent with the White Hat
    • View Profile
    • Donate to Member
I saw this tool before and wanted to comment, but I think I never did. I love this sort of tool - it's useful, full of byte-juggling and all that stuff. Way more fun than dorky GUI stuff. :)

I've been meaning to ask: how does TrID compare to built-in Linux tools? If I recall correctly, Linux has a file command that does the exact same thing, and there's probably a Windows port of that command too.

Mark0

I think that the main difference is how the library of filetypes is updated.
The idea with TrID was to develop something that had no fixed rules, and basically relied on definitions created by scanning a number of files of a certain type, and automatically detecting recurring patterns.

So, for example, you just have to get some ODT files (the more, the better, usually), run TrIDScan against them, and you end up with a new definition tailored for those files. Then you can edit it, add some info (a filetype description, a URL with reference info, etc.), maybe remove the "obviously unimportant patterns & strings" (caused by bytes that just happened to match in the small data set analyzed), and the job is done.
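
To give a feel for "automatically detecting recurring patterns", here is a very simplified Python sketch. It is not how TrIDScan is implemented (the post doesn't describe its internals); `common_patterns` is a hypothetical function that only looks for bytes that are identical at the same offset in every sample.

```python
def common_patterns(samples, max_offset=64):
    """Find runs of bytes that are identical at the same offset in all samples.

    Returns a list of (offset, bytes) tuples.
    """
    if not samples:
        return []
    limit = min(max_offset, min(len(s) for s in samples))
    first = samples[0]
    patterns = []
    run_start = None
    for offset in range(limit):
        same = all(s[offset] == first[offset] for s in samples)
        if same and run_start is None:
            run_start = offset  # a new shared run begins here
        elif not same and run_start is not None:
            patterns.append((run_start, first[run_start:offset]))
            run_start = None
    if run_start is not None:
        patterns.append((run_start, first[run_start:limit]))
    return patterns


# Two made-up file headers that differ only in a length field.
samples = [
    b"RIFF\x10\x00\x00\x00WAVEfmt ",
    b"RIFF\x24\x08\x00\x00WAVEfmt ",
]
for offset, data in common_patterns(samples):
    print(offset, data)
```

With these two samples the sketch finds `RIFF` at offset 0 and a second run starting at offset 6 that includes two zero bytes from the length field. Those zeros only match because the data set is tiny, which is the kind of unimportant pattern that is worth removing by hand, or avoiding by feeding in more files.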

Since it's very easy to create new definitions, I think TrID probably recognizes more filetypes than "file" (the tool).
Anyway, TrID's approach is a simple one that does seem to give good results. But it's certainly not perfect; for example, it's definitely not very good with text files, because it needs at least some fixed patterns.

BTW, I'm working on something completely different that I believe will result in a much better and more complete file identification system, but I don't have anything ready for prime time yet...

Cavalcader

Sure. The engine is still the same; just keep it fed with the newest definitions, and it's OK.
Are you saying that the GUI version doesn't have an issue with filesize > 2GB? Not that it's something I run into; just curious. :)

BTW, I'm working on something completely different that I believe will result in a much better and more complete file identification system, but I don't have anything ready for prime time yet...
Looking forward to seeing it when it's ready.  8)
« Last Edit: February 20, 2011, 02:11 AM by Cavalcader »

Mark0

Are you saying that the GUI version doesn't have an issue with filesize > 2GB? Not that it's something I run into; just curious. :)

Yes, there was and is no problem with big files in TrIDNet.