Messages - bhuiraj

1
How would I use/apply this? :)

I would hazard a guess that most apps that use a "one word per line" flat text file dictionary just suck the whole file into RAM and split on the end-of-line marker. For example, AutoIt3 has the user-defined function _FileReadToArray(). If the file fits in RAM it's trivial: each array element is a line of the file. Most dictionaries I've used are less than 2 MB in size.
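To make the "read it all and split on end of line" idea concrete, here is a minimal sketch in Python rather than AutoIt3. The function name load_wordlist is made up for illustration, and it assumes the file is UTF-8 text that fits comfortably in memory.

```python
def load_wordlist(path):
    """Read a one-word-per-line file into a list (hypothetical helper)."""
    # Assumption: the file is small enough to hold in RAM all at once.
    with open(path, encoding="utf-8", errors="replace") as f:
        data = f.read()              # whole file in memory
    # Text mode already translates \r\n and \r to \n, so one split is enough.
    words = data.split("\n")
    # A trailing newline leaves an empty last element; drop it.
    if words and words[-1] == "":
        words.pop()
    return words
```

Each list element is then one line of the file, much like the array described above. For a multi-gigabyte wordlist this approach stops being practical, since the whole file has to fit in memory.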

You haven't specified if you're using any particular software most often to access this data.


I use different pieces of software, so there isn't one specific one that I only use. All of them support this plain dictionary/wordlist/text file format.

Just for grins I started a thread on a db forum:

http://forums.databasejournal.com/showthread.php?p=129101#post129101

If anyone uses the terms "password cracking" or "dictionary attack" you're on your own!!



lol thanks :)

2
How would I use/apply this? :)

3
Out of curiosity, could you see the methodology? I would think something that big would have to use some type of merge sort. Especially if you only have one disk, that would be thrash city.

In cygwin's tmp folder, I noticed that the sort process created several smaller temporary files that were merged together in stages. My thought was that it splits the large dictionary file up into many smaller files of approximately the same size (some dictionaries were split up into multiple 64MB files), compares them, merges some of them together, compares again, and so forth until it ends up with two files that are each half the size of the complete deduplicated file. Then, those two files are merged together to create the final dictionary with the duplicates removed.
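For anyone curious what that split, sort, and merge approach looks like in code, here is a rough Python sketch of the general idea. It is not what cygwin's sort does internally; the function name sort_unique_external and the lines_per_chunk parameter are invented for illustration, and it merges all chunks in a single pass instead of in stages.

```python
import heapq
import os
import tempfile
from itertools import islice

def sort_unique_external(src, dst, lines_per_chunk=1_000_000):
    """Sort src and drop duplicate lines, writing to dst (hypothetical helper)."""
    chunk_paths = []
    # Phase 1: read a chunk that fits in RAM, sort and dedupe it, spill to disk.
    with open(src, encoding="utf-8", errors="replace") as f:
        while True:
            chunk = [line.rstrip("\n") for line in islice(f, lines_per_chunk)]
            if not chunk:
                break
            fd, path = tempfile.mkstemp(suffix=".chunk")
            with os.fdopen(fd, "w", encoding="utf-8") as out:
                out.writelines(w + "\n" for w in sorted(set(chunk)))
            chunk_paths.append(path)

    # Phase 2: stream-merge the sorted chunks, skipping repeated neighbours.
    files = [open(p, encoding="utf-8") for p in chunk_paths]
    try:
        streams = [(line.rstrip("\n") for line in fh) for fh in files]
        with open(dst, "w", encoding="utf-8") as out:
            prev = None
            for word in heapq.merge(*streams):
                if word != prev:     # duplicates are adjacent after the merge
                    out.write(word + "\n")
                    prev = word
    finally:
        for fh in files:
            fh.close()
        for p in chunk_paths:
            os.remove(p)
```

Calling sort_unique_external("big.txt", "big_sorted.txt") would write the sorted, deduplicated list to the second file (both file names are placeholders). With a very large input, the number of chunk files can exceed the limit on open files, which is one reason a real tool would merge in several stages as described above.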

4
In case anyone was wondering, it would take well in excess of a week to sort a 33GB dictionary. I started sorting my 33GB file on April 15th and finally cancelled it today (after 9 days) not even half done.

5
Thank you for all the help, guys :). I will continue to look for a solution and follow the w7 thread. Please continue posting if you have any thoughts or new ideas.
