it is not so much about speed as it is about address space (for both physical and virtual memory)
also, just loading your box full of RAM won't help in terms of speedup after a certain point
one main reason for the switch to 64-bit was the server market, where you have to address enormous amounts of storage/memory.
another thing is: consumers expect their software to work with any file size. say you want to open a 2GB or 4GB file.. well, that's not trivial. the 4GB limit of 32-bit pointers can be "fixed" for file access by using 64-bit offsets (slower on 32-bit machines), but you can't map that _whole_ thing into your virtual memory, since a 32-bit address space simply can't cover it. no way to fix that without nasty workarounds.
the end-consumer is just used to paying off the development costs by buying high-priced CPUs early on.
once the chip manufacturing plant has amortised itself and the manufacturers can better control power consumption, the same chips will be used in embedded systems a few years later.
embedded systems are the real market, not consumer PCs or servers.
after all, almost all processors manufactured worldwide end up in the embedded market (try to think of something electronic you bought lately that doesn't have a processor: vacuum cleaners have them, your coffee machine of course, the dishwasher.. all of them).
the embedded stuff just lags behind the others by a few years.
it basically comes down to this:
1) the server people needed more address space
2) consumers are told they need 64-bit (a few actually do, very few), and they are willing to pay a high price for the new CPUs, paying off the development costs (see gamers, who are willing to pay anything)
3) the embedded market will profit from cheap 64-bit cores in a few years