Essays on Proper Storage of Site Passwords

db90h:
The entire question is about entropy. This also goes for compression, though in a different manner.
-Renegade (June 13, 2012, 10:14 AM)
--- End quote ---

Indeed, Renegade is right, as always, but I wanted to comment on this when I got a chance, to elaborate on compression, since that is one field where I can claim expertise (being the author of more than one LZ77/LZSS derivative algorithm). Entropy in compression is different indeed, but similar too. In compression, it of course represents the minimum theoretical size you can squeeze the data into while keeping it intact (reconstructable in decompression without loss).
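To put a rough number on that "minimum theoretical size", here is a minimal sketch that estimates the order-0 Shannon entropy of a byte string. The function names are made up for illustration, and order-0 is a simplifying assumption: it treats every byte as independent, so a real compressor that exploits context (repeated strings, for instance) can land well below this figure.

```python
import math
from collections import Counter


def order0_entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = -sum(p * log2(p)) over every byte value that actually occurs.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def order0_size_estimate_bytes(data: bytes) -> int:
    """Size estimate for a coder that models only single-byte frequencies."""
    return math.ceil(order0_entropy_bits_per_byte(data) * len(data) / 8)


if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog " * 100
    print(order0_entropy_bits_per_byte(sample), order0_size_estimate_bytes(sample))
```

A buffer holding a single repeated byte scores 0 bits per byte, and a buffer in which all 256 byte values are equally frequent scores the full 8.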

In compression though, passing data through more than one compression algorithm does *not* get you any closer to that entropy limit. In fact, it may make the result larger.
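This is easy to try for yourself. The sketch below uses Python's standard zlib module on some synthetic word soup (the vocabulary, the seed, and the helper name are all just assumptions for the demo) and compresses it once, then compresses the result again. Exact sizes depend on the input and the zlib build, so none are quoted here; on typical data the second pass gains essentially nothing and usually adds a few bytes of overhead.

```python
import random
import zlib


def make_sample_text(word_count: int = 20_000, seed: int = 0) -> bytes:
    """Build repeatable, text-like test data from a small made-up vocabulary."""
    rng = random.Random(seed)
    vocabulary = [
        b"entropy", b"compression", b"dictionary", b"match", b"literal",
        b"offset", b"window", b"stream", b"header", b"symbol", b"block",
        b"reference",
    ]
    return b" ".join(rng.choice(vocabulary) for _ in range(word_count))


original = make_sample_text()
once = zlib.compress(original, 9)   # first pass removes most redundancy
twice = zlib.compress(once, 9)      # second pass has little left to find

if __name__ == "__main__":
    print("original:", len(original))
    print("once:    ", len(once))
    print("twice:   ", len(twice))
```

The output of the first pass already looks close to random to the second pass, which is why nesting archives buys you nothing.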

Now, you can pass it through different pre-processing algorithms that re-arrange the data and THEN compress it, which improves the compression ratio, but most compression algorithms have these pre-processing algorithms built in. And those are not compression algorithms, they are pre-processing/re-arranging algorithms. For example, with PECompact, by making tweaks to x86 code before compression, the compression ratio can be improved by 20% in many cases, depending on the code (could be more, could be less). The LZMA implementation in 7-Zip now ships with such a pre-processor (known as BCJ2). There are MANY more that target different types of data. By making these tweaks, you improve the chances for a 'match' in dictionary-based compression (where it matches data it has already seen, and emits a backwards reference to that data, thereby saving space).
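To illustrate the general idea of a reversible pre-processing step, here is a minimal sketch using a plain delta filter. To be clear, this is NOT what PECompact or BCJ2 do (those rewrite x86 branch targets); it is a much simpler stand-in, and the synthetic "slowly drifting signal" and all function names are assumptions chosen so the effect is easy to see.

```python
import random
import zlib


def delta_encode(data: bytes) -> bytes:
    """Replace each byte with its difference from the previous byte (mod 256)."""
    out = bytearray(len(data))
    prev = 0
    for i, value in enumerate(data):
        out[i] = (value - prev) & 0xFF
        prev = value
    return bytes(out)


def delta_decode(data: bytes) -> bytes:
    """Exact inverse of delta_encode: a running sum (mod 256)."""
    out = bytearray(len(data))
    prev = 0
    for i, diff in enumerate(data):
        prev = (prev + diff) & 0xFF
        out[i] = prev
    return bytes(out)


def make_drifting_signal(length: int = 50_000, seed: int = 0) -> bytes:
    """Synthetic data: a value that creeps upward by 0, 1 or 2 per step."""
    rng = random.Random(seed)
    value = 0
    out = bytearray()
    for _ in range(length):
        value = (value + rng.choice((0, 1, 2))) & 0xFF
        out.append(value)
    return bytes(out)


signal = make_drifting_signal()
plain_size = len(zlib.compress(signal, 9))
filtered_size = len(zlib.compress(delta_encode(signal), 9))

if __name__ == "__main__":
    print("zlib only:        ", plain_size)
    print("delta, then zlib: ", filtered_size)
```

The filter does not shrink anything by itself; the output is the same length as the input. It only re-arranges the data so that the compressor sees a handful of small, repeating values instead of a sweep across all 256, and the decoder undoes it exactly after decompression.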

My POINT is to MAKE SURE that nobody misunderstands Renegade's accurate and wise comment as meaning they should pass their data through more than one compression algorithm. I *hate* seeing this: ZIPs inside of RARs, inside of ZIPs, etc. It's absurd. Don't anybody do that, please ;).

Renegade:
My POINT is to MAKE SURE that nobody misunderstands Renegade's accurate and wise comment as meaning they should pass their data through more than one compression algorithm. I *hate* seeing this: ZIPs inside of RARs, inside of ZIPs, etc. It's absurd. Don't anybody do that, please ;).
-db90h (June 13, 2012, 01:50 PM)
--- End quote ---

Ooops. Sorry about that. You're quite right. Successive compression doesn't guarantee size reduction, and in fact often results in larger file sizes. I didn't clarify that properly and left it open to the wrong impression.

Thanks for the clarification there~! :D
