WinRAR sorts input files by extension, and there's an additional "RarFiles.lst" so that files with similar content are grouped together: e.g., .html and .txt would be relatively "far apart" if sorted purely by extension, but both are plain text and so should end up near each other. The file also has a "$default" entry marking where the plain group-by-extension logic applies, and after that come the basically incompressible files, listed so they don't "poison" the compression dictionary.
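A rough Python sketch of that ordering idea, in case it helps. The mask list here is made up for illustration (it is not the real contents of RarFiles.lst), and `ORDER`, `sort_key` and `order_files` are hypothetical names, not anything from WinRAR itself:

```python
from fnmatch import fnmatchcase
from pathlib import PurePath

# Hypothetical ordering list in the spirit of RarFiles.lst (NOT its real contents):
# text-like masks first, "$default" for everything unlisted, incompressible stuff last.
ORDER = ["*.txt", "*.html", "*.htm", "$default", "*.jpg", "*.zip", "*.rar"]


def sort_key(name):
    """Return a key that places `name` according to its position in ORDER."""
    lower = name.lower()
    for rank, mask in enumerate(ORDER):
        if mask != "$default" and fnmatchcase(lower, mask):
            # Explicitly listed: position in the list decides the order.
            return (rank, "", lower)
    # Not listed: falls into the "$default" slot, grouped by extension.
    return (ORDER.index("$default"), PurePath(lower).suffix, lower)


def order_files(names):
    """Sort file names so similar content ends up adjacent."""
    return sorted(names, key=sort_key)


if __name__ == "__main__":
    print(order_files(["b.zip", "a.html", "x.c", "y.txt", "z.c", "p.jpg"]))
```

The point of the `$default` slot is that anything not listed explicitly still gets grouped by extension, but lands before the files that are known not to compress.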
WinRAR has also supported "input filtering" for a while: it applies (100% reversible) transformations to certain input formats (iirc .wav, .bmp and .exe are included) so that those formats compress better.
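To give a feel for what a reversible input filter looks like, here is a minimal delta filter in Python. This is just a generic textbook transform used as an illustration, not a claim about how WinRAR's filters are actually implemented; `delta_encode`, `delta_decode` and the `channels` parameter are names I picked for the sketch:

```python
def delta_encode(data: bytes, channels: int = 1) -> bytes:
    """Replace each byte with its difference from the byte `channels` positions back."""
    out = bytearray(len(data))
    for i, b in enumerate(data):
        prev = data[i - channels] if i >= channels else 0
        out[i] = (b - prev) & 0xFF  # wrap around modulo 256
    return bytes(out)


def delta_decode(data: bytes, channels: int = 1) -> bytes:
    """Exact inverse of delta_encode: rebuild each byte from the running sum."""
    out = bytearray(len(data))
    for i, d in enumerate(data):
        prev = out[i - channels] if i >= channels else 0
        out[i] = (d + prev) & 0xFF
    return bytes(out)


if __name__ == "__main__":
    sample = bytes(range(0, 200, 2))  # a smooth ramp, like slowly changing samples
    encoded = delta_encode(sample)
    assert delta_decode(encoded) == sample  # lossless round trip
    print(encoded[:8])
```

On smoothly varying data (audio samples, pixel rows) the filtered output is mostly small, repetitive values, which a general-purpose compressor handles much better than the raw bytes. The decoder undoes the transform exactly, so nothing is lost.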