What interests me is how it is able to infect other machines on the other side of the airgap. Somehow I don't think most computers routinely check their microphones for incoming data, let alone execute data recorded from the air.
-SeraphimLabs
Correct. The problem is primarily on machines that have the microphone enabled and leave it that way. The paper mentioned users of Skype and similar applications. But any infection that availed itself of this concept would not need to be limited to acoustic communications. It merely adds yet another vector for infection (as you noted) to the palette of methods we're already familiar with.
For example, suppose you could infect a machine in the usual way (e.g. wire, wireless, media, download) and covertly enable its sound system to transmit keystroke data. The malware could also have the microphone listen for an "I'm listening" signal from a zombie routing machine before it starts playing that data. The zombie device could then record your keystrokes and transmit them elsewhere for analysis and possible later use, via whatever network it's connected to.
It's not so much what this can do now. But give it some time. Just sitting with some of my "in the biz" cronies, we came up with a few dozen viable ideas. That was without even trying. And none of us are real hacker types. Just imagine what the real professional 'naughty folks' will come up with.