New sound mixer | Catnap Games

The audio effects you get with OpenAL are:

pitch
pan
gain
that’s it

It’s a shame that OpenAL doesn’t provide more built-in effects or some equivalent of OpenGL’s shaders. HW accelerated or not – they could provide a lot of flexibility without having to rely on vendor-specific extensions.

With Superforce we had to make do with pitch shift where we really wanted a lowpass filter (in-game pause menu slows down the background music to 50%; this workaround actually turned out nice but lowpass or a combination of the two could have been even better). For the next game I decided to write my own audio mixer.

The actual sound mixing code turned out pretty well. Pitch shift was a bit tricky to get right. I also integrated the stb_vorbis decoder for streaming music. At first I was willing to keep using OpenAL as a backend since I already had it working. But OpenAL is a layer on top of Core Audio, so I couldn’t sleep well knowing that I could get rid of it…

So, Core Audio. With OpenAL, you play sounds. You can take your time. OpenAL will keep playing the sounds you already fired – no problem, no glitches. With Core Audio, it works the other way – you don’t call them, they call you. Like some sort of crime syndicate. And they will keep calling, whether you’re ready or not. You better be.

I got basic Core Audio output to work on the Mac pretty quickly but the setup wasn’t right. I was filling my mix buffer on the main thread and the Core Audio callback could come in to pick it up whenever it wanted. This was definitely going to cause problems when I’d start playing more sounds – sooner or later the callback was going to happen at a BAD TIME. Also, whenever the main thread got behind even a little bit during fullscreen transitions or loading, I could hear glitches – it didn’t fill the buffer quickly enough.

After a bit of research I settled on a more sophisticated approach. It is multithreaded and it gets me gapless playback even when loading a level or switching to fullscreen mode.

MAIN THREAD (GAME LOOP) → PLAY SOUND 1 → PLAY SOUND 2 → VSYNC, RENDER → REPEAT
MIXER THREAD → GET CURRENT SOUNDS → MIX INTO A FREE BUFFER → WAIT A BIT → REPEAT
CORE AUDIO CALLBACK → GET A FULL BUFFER FROM MIXER → USE IT TO FILL CORE AUDIO BUFFER → RETURN

The threads communicate using lock-free queues.

The game thread starts and stops sounds by sending commands into a lock-free queue.
The mixer thread reads from that queue and updates its internal information; then it gets a new empty buffer and mixes the sounds into it. When the buffer is filled it is put on another lock-free queue.
The Core Audio callback is called by the OS. It gets a full buffer from the buffer queue and just copies that into the destination Core Audio buffer. The size of the buffers coming from the mixer may differ from Core Audio, so it keeps a “cursor” inside the current source buffer.

There are several variables I can adjust:

Buffer size (currently 512 samples, matching Core Audio)
Buffer count (currently 3)
Mixer thread update frequency (currently 100Hz)

The latency is ~30ms, which is about two frames at 60FPS. I think that’s fine. I should probably add another thread dedicated to reading audio streams from disk and decoding to avoid problems with slow disk access. Some people still have HDDs…

In general, I’m quite happy with this. Ironically, my new mixer can still do only pitch, pan and gain but I now have the option to write my own sound effects, because I control the whole thing.