DonationCoder.com Software > Post New Requests Here

IDEA: Compare dictionary (text) files and remove duplicate entries

MilesAhead:
How would I use/apply this? :)
-bhuiraj (May 02, 2011, 05:33 PM)
--- End quote ---

I would hazard a guess that most apps that use a "one word per line" flat text file dictionary just suck the whole file into RAM and split on the end-of-line marker. For example, AutoIt3 has the user-defined function _FileReadToArray(). If the file fits in RAM it's trivial: each array element is a line of the file. Most dictionaries I've used are less than 2 MB in size.
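
To illustrate the "read it all into RAM and split on line endings" approach, here is a minimal Python sketch (not AutoIt, and not taken from any of the apps discussed). The helper names read_wordlist and dedupe_keep_order and the sample words are made up for the example, and it assumes the whole file fits comfortably in memory.

```python
import os
import tempfile


def read_wordlist(path):
    """Read a one-word-per-line file into a list, one element per line."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        # splitlines() splits on line endings and leaves no trailing empty element
        return fh.read().splitlines()


def dedupe_keep_order(words):
    """Drop repeated entries, keeping the first occurrence of each."""
    # dict preserves insertion order (Python 3.7+)
    return list(dict.fromkeys(words))


def demo():
    # Write a tiny sample file so the sketch runs on its own (hypothetical data).
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write("apple\nbanana\napple\ncherry\nbanana\n")
    try:
        return dedupe_keep_order(read_wordlist(path))
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

For a 2 MB dictionary this is about as simple as it gets; it stops being an option once the file is larger than available RAM.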

You haven't specified if you're using any particular software most often to access this data.

MilesAhead:
Just for grins I started a thread on a db forum:

http://forums.databasejournal.com/showthread.php?p=129101#post129101

If anyone uses the terms "password cracking" or "dictionary attack" you're on your own!!


bhuiraj:
How would I use/apply this? :)
-bhuiraj (May 02, 2011, 05:33 PM)
--- End quote ---

I would hazard a guess that most apps that use a "one word per line" flat text file dictionary just suck the whole file into RAM and split on the end-of-line marker. For example, AutoIt3 has the user-defined function _FileReadToArray(). If the file fits in RAM it's trivial: each array element is a line of the file. Most dictionaries I've used are less than 2 MB in size.

You haven't specified if you're using any particular software most often to access this data.


-MilesAhead (May 02, 2011, 05:41 PM)
--- End quote ---
I use several different pieces of software, so there isn't one specific program that I use exclusively. All of them support this plain dictionary/wordlist/text file format.

Just for grins I started a thread on a db forum:

http://forums.databasejournal.com/showthread.php?p=129101#post129101

If anyone uses the terms "password cracking" or "dictionary attack" you're on your own!!



-MilesAhead (May 02, 2011, 06:13 PM)
--- End quote ---
lol thanks :)

DK2IT:
How would I use/apply this? :)
-bhuiraj (May 02, 2011, 05:33 PM)
--- End quote ---
Just use a DB manager for SQLite, like SQLite Database Browser or the command-line version; there are many other programs as well.

I don't see what you suggested that I didn't already in this post:
https://www.donationcoder.com/forum/index.php?topic=26416.msg245865#msg245865
-MilesAhead (May 02, 2011, 05:36 PM)
--- End quote ---
Nothing new, just a real implementation, because we didn't know how fast a DB is with the keyword as the key. I can say that it is very fast and does not need much RAM, but it does need hard disk space. Maybe an enterprise DB (like Oracle/MySQL/etc.) can handle GBs of data better than SQLite, but the principle is the same.
Of course, you must find the right program to handle it, because some GUI apps (like SQLite DB Browser) load the file into RAM and need over 1 GB for that 100 MB file. The command-line version needs only about 3 MB instead.
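
As a rough idea of what "a DB with the keyword as the key" can look like, here is a minimal sketch using Python's built-in sqlite3 module. The table name words, the column name word, and the helper functions are assumptions made for the example, not the exact schema used in the test above. The primary key is what rejects duplicate entries on insert.

```python
import sqlite3


def build_word_db(words, db_path=":memory:"):
    """Load words into a keys-only table; the primary key rejects repeats."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS words (word TEXT PRIMARY KEY)")
    with conn:  # one transaction for the whole batch, committed on exit
        conn.executemany(
            "INSERT OR IGNORE INTO words (word) VALUES (?)",
            ((w,) for w in words),
        )
    return conn


def contains(conn, word):
    """Look a word up through the primary-key index."""
    row = conn.execute("SELECT 1 FROM words WHERE word = ?", (word,)).fetchone()
    return row is not None


def count(conn):
    """Number of distinct words stored."""
    return conn.execute("SELECT COUNT(*) FROM words").fetchone()[0]


if __name__ == "__main__":
    conn = build_word_db(["apple", "banana", "apple", "cherry"])
    print(count(conn), contains(conn, "banana"), contains(conn, "durian"))
    conn.close()
```

Passing a file name instead of ":memory:" keeps the data on disk, which is the trade-off described above: little RAM, more hard disk space.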

MilesAhead:
Nothing new, just a real implementation, because we didn't know how fast a DB is with the keyword as the key. I can say that it is very fast and does not need much RAM, but it does need hard disk space. Maybe an enterprise DB (like Oracle/MySQL/etc.) can handle GBs of data better than SQLite, but the principle is the same.
Of course, you must find the right program to handle it, because some GUI apps (like SQLite DB Browser) load the file into RAM and need over 1 GB for that 100 MB file. The command-line version needs only about 3 MB instead.
--- End quote ---

I added a reply to the "ask the expert" thread I started, saying there has to be a "worst case scenario", with a keys-only DB likely to be it. That was yesterday, and I see it still hasn't cleared the moderator. I think they don't really want to bring up the Achilles' heel. I doubt I'll see my reply.

To really test this out you should have some method that directly accesses the flat file, then compare the two for speed vs. overhead. A dummy run of a few MB doesn't mean anything, since just about any manipulation done entirely in RAM is going to be fast. We need a comparison of DB and non-DB access for, say, an 8 GB flat file of words. Then see what happens.
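
A minimal sketch of the "direct flat file access" side of such a comparison might look like the Python below. The function names, the synthetic word0/word1/... entries, and the 200,000-line sample size are all assumptions for illustration; as noted above, a sample that small proves nothing, so a real test would point it at a multi-GB file and time the DB lookup the same way.

```python
import os
import tempfile
import time


def flat_file_contains(path, target):
    """Scan the file line by line; only one line is held in memory at a time."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if line.rstrip("\r\n") == target:
                return True
    return False


def timed(func, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def make_sample(n_words):
    """Write n_words synthetic entries (word0, word1, ...) to a temp file."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        for i in range(n_words):
            fh.write(f"word{i}\n")
    return path


if __name__ == "__main__":
    path = make_sample(200_000)
    try:
        # Worst case for a linear scan: the target is the last line.
        found, secs = timed(flat_file_contains, path, "word199999")
        print(f"found={found} in {secs:.4f}s")
    finally:
        os.remove(path)
```

A linear scan like this is the simplest baseline; a sorted flat file would also allow binary search, which is a different comparison again.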

My guess is that the DB overhead would not be worth the effort compared to direct flat-file access and manipulation for simple search. I also suspect that if you made a 34 GB table of keys, the DB would crash on the OP's machine. :)
