Games tend to be more GPU-bound than CPU-bound, so testing those probably doesn't make too much sense (although some of the work is done in the graphics drivers, so at least that code can take advantage of the extra registers etc).
Btw, back in the really old days, when NT was available for the 64bit Alpha processor, some 32bit x86 software ran faster emulated on the Alpha than on native x86 hardware... but that was because those programs spent most of their time calling APIs, which of course were native code on the Alpha, reached through a thunking layer. The JITing they did for the rest of the app code was pretty impressive, especially considering how long ago this was.
Anyway, back on subject - I agree with mouser that benchmarking should really be done on 32bit vs 64bit builds of the same software, since 32bit code on a 64bit OS still executes as plain 32bit code, and so doesn't really gain anything from the OS being 64bit.
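To make the "same software, both builds" idea concrete, here is a rough sketch of what such a test could look like, assuming you have both a 32bit and a 64bit Python interpreter installed and run the identical script under each. The `workload` function is a made-up CPU-bound stand-in, not a real benchmark, and the names `pointer_bits` and `bench` are just for illustration.

```python
import struct
import time


def pointer_bits():
    """Return the pointer width of the running interpreter (32 or 64)."""
    return struct.calcsize("P") * 8


def workload(n=200_000):
    """Hypothetical CPU-bound stand-in: a 64bit multiply/xor mixing loop."""
    mask = (1 << 64) - 1
    x = 0x9E3779B97F4A7C15
    acc = 0
    for _ in range(n):
        x = (x * 6364136223846793005 + 1442695040888963407) & mask
        acc ^= x >> 33
    return acc


def bench(func, repeats=5):
    """Run func several times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Label the result with the build's bitness so the two runs can be compared.
    print(f"{pointer_bits()}bit build: best of 5 = {bench(workload):.4f}s")
```

Run it once with each interpreter on the same machine and compare the two timings. Running only the 32bit build on both a 32bit and a 64bit OS would mostly measure the OS, not the code.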