Does KGB Archiver really achieve high compression rates?

f0dder:
Slow compression can be justified if the compression rate is very good and decompression is substantially faster...

tinjaw:
I think something that many people overlook is that many of today's data files are already compressed, so compressing them again costs time and CPU cycles for little gain. For example, mp3, jpg, gif, jar, and others are already compressed.
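A quick way to see this for yourself is to compress some data twice. This is only a rough sketch: it uses zlib from Python's standard library as a stand-in for whatever codec an mp3 or jpg uses internally, and the word list and seed are made up for the demo.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
# Made-up vocabulary; any ordinary text would behave similarly.
words = ["archive", "ratio", "window", "entropy", "block", "stream", "byte", "match"]
text = " ".join(rng.choice(words) for _ in range(20000)).encode()


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Stand-in for an "already compressed" file such as an mp3 or jpg.
already_compressed = zlib.compress(text, 9)

first_pass = ratio(text)
second_pass = ratio(already_compressed)
print(f"plain text: {first_pass:.2f}, already compressed: {second_pass:.2f}")
```

The first ratio should come out well below 1, while the second stays close to 1: the second pass burns CPU and saves next to nothing.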

f0dder:
.jar files are just .zip files. For download use (as opposed to applet-on-a-site use), I'd probably build them with the STORE method, so the .jars can be properly compressed afterwards :P

But yes, you do have a point, lots of file formats don't compress very well.
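To illustrate the STORE idea for .jar files, here is a small sketch using Python's zipfile and lzma modules. The "class files" are invented stand-ins (random lines drawn from a shared pool, so the files resemble each other), and lzma is just one possible outer compressor, not a claim about what any particular archiver uses.

```python
import io
import lzma
import random
import string
import zipfile

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: 20 files that share most of their lines, in different order.
pool = ["".join(rng.choice(string.ascii_lowercase) for _ in range(60)) + "\n"
        for _ in range(50)]
files = {
    f"pkg/Class{i:02d}.class": "".join(rng.sample(pool, 40)).encode()
    for i in range(20)
}


def build_jar(entries: dict, method: int) -> bytes:
    """Build an in-memory .zip/.jar using the given per-entry method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buf.getvalue()


stored_jar = build_jar(files, zipfile.ZIP_STORED)      # entries left uncompressed
deflated_jar = build_jar(files, zipfile.ZIP_DEFLATED)  # entries deflated one by one

# Now compress each .jar as a whole, as an outer archiver would.
stored_then_lzma = len(lzma.compress(stored_jar))
deflated_then_lzma = len(lzma.compress(deflated_jar))
print(stored_then_lzma, deflated_then_lzma)
```

With the entries stored, the outer compressor can see the redundancy shared between files; once each entry has been deflated on its own, that redundancy is hidden from it.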

tinjaw:
Here is something I have been meaning to research. I assume programs like 7-zip, WinRAR and the like must have an option to take into consideration the file being compressed. I assume you could have a special compression setting that compresses stuff in different ways depending on the source file.

It might even be advantageous to take a look at the files beforehand and build the archive with the highly compressible stuff in front and the already compressed stuff in the back. It could then switch algorithms as it goes, creating chunks: text and similar files using one algorithm, and mp3, jpg, and jar using another. Or does somebody already do this?
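Roughly what I have in mind, as a sketch only. Everything here is hypothetical: the function names, the 4 KB sample size, the 0.9 threshold, and the choice of "lzma" versus "store" as the two methods. It is not how any existing archiver is known to work.

```python
import lzma
import random
import zlib


def sample_ratio(data: bytes, sample_size: int = 4096) -> float:
    """Cheap compressibility estimate: fast-deflate a leading sample."""
    sample = data[:sample_size]
    if not sample:
        return 1.0
    return len(zlib.compress(sample, 1)) / len(sample)


def plan(files: dict, threshold: float = 0.9) -> list:
    """Order files from most to least compressible and pick a method for each."""
    scored = sorted((sample_ratio(data), name) for name, data in files.items())
    return [(name, "lzma" if r < threshold else "store") for r, name in scored]


def pack(files: dict, threshold: float = 0.9) -> list:
    """Return (name, method, payload) chunks in the planned order."""
    chunks = []
    for name, method in plan(files, threshold):
        data = files[name]
        payload = lzma.compress(data) if method == "lzma" else data
        chunks.append((name, method, payload))
    return chunks


rng = random.Random(0)
demo_files = {
    # Random bytes stand in for an already-compressed mp3.
    "song.mp3": bytes(rng.getrandbits(8) for _ in range(5000)),
    "notes.txt": b"hello world " * 500,
}
demo_plan = plan(demo_files)
demo_chunks = pack(demo_files)
print(demo_plan)
```

The text file ends up in front with the heavy algorithm, and the mp3 stand-in goes to the back untouched.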

f0dder:
WinRAR sorts input files based on extension, and there's an additional "RarFiles.lst" so that files with similar content are grouped together: e.g., .html and .txt would be relatively "far away" from each other if grouped just on extension, but they have a similar (plain-text) format, and thus should be grouped together-ish. The file also has a "$default" entry where the group-by-extension logic goes, and after that, basically incompressible files are listed so they don't "poison" the compression dictionary.
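The ordering described above could look something like this. Note that this is a sketch of the general idea only: the group tables are invented for illustration and do not reproduce the actual contents or syntax of RarFiles.lst.

```python
import os

# Hypothetical groups of extensions with similar content.
GROUPS = [
    {".txt", ".html", ".htm"},
    {".c", ".h", ".cpp"},
]
# Hypothetical list of formats that barely compress; these go last.
INCOMPRESSIBLE = {".mp3", ".jpg", ".gif", ".zip", ".jar"}


def sort_key(path: str) -> tuple:
    """Listed groups first, then everything else by extension, then incompressible files."""
    ext = os.path.splitext(path)[1].lower()
    for rank, group in enumerate(GROUPS):
        if ext in group:
            return (rank, "", path)
    if ext in INCOMPRESSIBLE:
        return (len(GROUPS) + 1, ext, path)
    return (len(GROUPS), ext, path)  # the "default" slot: plain group-by-extension


def order_for_archive(paths: list) -> list:
    return sorted(paths, key=sort_key)


ordered = order_for_archive(["b.mp3", "a.html", "z.txt", "m.dat", "a.c"])
print(ordered)
```

Here a.html and z.txt land next to each other even though their extensions sort far apart, and b.mp3 is pushed to the end.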

Also, it has supported "input filtering" for a while, which basically makes some (100% reversible) translations on certain input formats (iirc .wav, .bmp and .exe are included) to achieve better compression for those formats.
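For a feel of what such a reversible filter does, here is a plain delta filter, which is one classic choice for sample data like .wav. It is an assumption for illustration, not WinRAR's actual filter, and the random-walk "signal" is a made-up stand-in for slowly varying audio samples.

```python
import random
import zlib


def delta_encode(data: bytes) -> bytes:
    """Replace each byte with its difference from the previous byte (mod 256)."""
    prev = 0
    out = bytearray()
    for b in data:
        out.append((b - prev) & 0xFF)
        prev = b
    return bytes(out)


def delta_decode(data: bytes) -> bytes:
    """Exact inverse of delta_encode: running sum mod 256."""
    prev = 0
    out = bytearray()
    for d in data:
        prev = (prev + d) & 0xFF
        out.append(prev)
    return bytes(out)


# Hypothetical 8-bit signal: each sample moves by -1, 0 or +1 from the last one.
rng = random.Random(0)
level = 128
samples = bytearray()
for _ in range(20000):
    level = (level + rng.choice((-1, 0, 1))) & 0xFF
    samples.append(level)
raw = bytes(samples)

filtered = delta_encode(raw)
raw_size = len(zlib.compress(raw, 9))
filtered_size = len(zlib.compress(filtered, 9))
print(raw_size, filtered_size)
```

The filtered stream contains only a handful of distinct byte values, so the compressor has a much easier time with it, and delta_decode gets the original back bit for bit.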
