i think it's really important to consider a critical fact in these tests:
they tend to be done with "out of the box" (default) settings in the firewalls.
this matters a lot, because it means the tests tend to measure not the firewall's potential ability to block things, but its performance with default settings, which can be hugely different depending on how the firewall is configured out of the box.
for example, some firewall makers try to avoid bothering the average user with pop-ups, so by default they allow stuff that an advanced user would block. other firewalls are overly paranoid by default, meaning tons of pop-ups, which an advanced user might tone down by adjusting those settings.
so i take these scores with a grain of salt.