Realtime Audio

Real-time I/O and I/O APIs

Portable (aka cross-platform) APIs
- PortAudio
- RtAudio
- Web Audio API (Browser)
- JACK (includes the notion of routing/patching)
“Native” APIs (we use these when we must)
- ALSA (Linux)
- CoreAudio (macOS/iOS)
- ASIO (Windows)
- …lots more

Callback versus blocking interfaces

Here’s what a simplified callback-based audio program might look like:

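#include <stdio.h> // getchar

float gain = 0.5f;

// Called by the audio system on its own thread each time a block is due.
// in and out hold interleaved samples: frameCount frames of channelCount channels.
int process(float *in, float *out, int frameCount, int channelCount, int sampleRate) {
  for (int i = 0; i < frameCount; ++i) {
    for (int channel = 0; channel < channelCount; ++channel) {
      *out = *in * gain;
      out++;
      in++;
    }
  }
  return 0; // assumed here to mean "keep the stream running"
}

int main() {
  // setupCallback stands in for whatever setup call the audio API provides.
  setupCallback(process, 1024, 2, 44100);
  getchar(); // keep the main thread alive until Enter is pressed
  return 0;
}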

Using a callback generally means…

Parallel programming
More difficult to code/understand
More control
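
The “parallel programming” point shows up as soon as the main thread wants to change something the callback reads, such as gain. The following is a minimal sketch of one common approach, assuming a C++11 compiler; setGain is a hypothetical helper, and the sampleRate parameter of process above is dropped for brevity. std::atomic<float> is lock-free on most mainstream platforms, but the standard does not guarantee it, so it is worth checking gain.is_lock_free() on the target.

```cpp
#include <atomic>
#include <cassert>

// Shared parameter: written by the main thread, read by the audio thread.
std::atomic<float> gain{0.5f};

// Same shape as the process() callback above; runs on the audio thread.
int process(const float *in, float *out, int frameCount, int channelCount) {
  assert(in != nullptr && out != nullptr);
  // Load the shared value once so the whole block uses a single gain.
  const float g = gain.load(std::memory_order_relaxed);
  const int sampleCount = frameCount * channelCount;
  for (int i = 0; i < sampleCount; ++i) {
    out[i] = in[i] * g;
  }
  return 0;
}

// Hypothetical helper: called from the main thread, e.g. on a key press.
void setGain(float newGain) {
  gain.store(newGain, std::memory_order_relaxed);
}
```

Relaxed ordering is enough here because the callback only needs some recent value of gain, not a guarantee about the order of other memory operations.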

Here’s what a simplified blocking I/O-based audio program might look like:

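int main() {
  int frameCount = 1024;
  int channelCount = 2;
  int sampleRate = 44100;
  Stream stream = setupStream(frameCount, channelCount, sampleRate);

  float gain = 0.5f;
  bool done = false; // set to true elsewhere (e.g. on user input) to stop

  while (!done) {
    // Blocks until a full block of input is available.
    float *buffer = stream.readBlock();
    // Walk the block with a second pointer so buffer still marks its start.
    float *sample = buffer;
    for (int i = 0; i < frameCount; ++i) {
      for (int channel = 0; channel < channelCount; ++channel) {
        *sample = *sample * gain;
        sample++;
      }
    }
    // Blocks until the output can accept the block.
    stream.writeBlock(buffer);
  }
}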

This blocking I/O style generally means…

Linear (sequential) programming
Easier to code/understand
Less control

See PortAudio’s blocking read/write and RtAudio’s Playback (blocking functionality).

Considerations for real-time audio programming

If you don’t know how long it will take, don’t do it. (Ross Bencina)

Here are some more rules from Ross Bencina’s post:

…a few rules of thumb for code that executes in a real-time audio callback:
- Don’t allocate or deallocate memory
- Don’t lock a mutex
- Don’t read or write to the filesystem or otherwise perform i/o. (In case there’s any doubt, this includes things like calling printf or NSLog, or GUI APIs.)
- Don’t call OS functions that may block waiting for something
- Don’t execute any code that has unpredictable or poor worst-case timing behavior
- Don’t call any code that does or may do any of the above
- Don’t call any code that you don’t trust to follow these rules
- On Apple operating systems follow Apple’s guidelines
…a few things you should do where possible:
- Do use algorithms with good worst-case time complexity (ideally O(1) worst-case)
- Do amortize computation across many audio samples to smooth out CPU usage rather than using “bursty” algorithms that occasionally have long processing times
- Do pre-allocate or pre-compute data in a non-time-critical thread
- Do employ non-shared, audio-callback-only data structures so you don’t need to think about sharing, concurrency and locks
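
As an illustration of “pre-allocate” plus “O(1) worst-case”, here is a minimal sketch of a fixed-length delay line. The class name Delay and its interface are hypothetical, not taken from any library. All memory is obtained in the constructor, which is meant to run before audio starts; tick only touches that memory, so it does the same small amount of work on every call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-length delay line: all memory is allocated up front.
class Delay {
 public:
  // Call from a non-time-critical thread, before audio starts.
  explicit Delay(std::size_t delaySamples) : buffer_(delaySamples, 0.0f) {
    assert(delaySamples > 0);
  }

  // Intended for the audio callback: no allocation, no locks, O(1) per sample.
  float tick(float input) {
    const float delayed = buffer_[index_];  // sample written delaySamples ago
    buffer_[index_] = input;                // overwrite it with the newest sample
    index_ = (index_ + 1 == buffer_.size()) ? 0 : index_ + 1;
    return delayed;
  }

 private:
  std::vector<float> buffer_;
  std::size_t index_ = 0;
};
```

With Delay d(2), the calls d.tick(1), d.tick(2), d.tick(3) return 0, 0, 1: each input comes back out two calls later.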

See Real-time computing.

Computing Concepts

I/O
- Internet/Cloud
- Wifi
- Wired Network
- Disk
- Solid-state Storage
- RAM
- Special Devices (Audio A/DAC)
Threads
- Creation and destruction
- OS Scheduler Priority
Buffering
- FIFO queue/buffer
- “wait free” queue
Complexity
- Best case
- Average case
- Worst case
- Murphy’s Law: Anything that can go wrong, will go wrong.
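
To make the “wait free” queue item concrete, here is a minimal sketch of a single-producer, single-consumer FIFO, assuming exactly one thread calls push and exactly one other thread calls pop. The name SpscQueue and its interface are hypothetical. Neither operation loops, locks, or allocates: when the queue is full or empty, the call returns false immediately instead of waiting.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical single-producer, single-consumer FIFO of floats.
// Capacity is fixed at compile time, so push/pop never allocate.
template <std::size_t Capacity>
class SpscQueue {
  static_assert(Capacity > 1 && (Capacity & (Capacity - 1)) == 0,
                "Capacity must be a power of two");

 public:
  // Producer thread only. Returns false (instead of waiting) when full.
  bool push(float value) {
    const std::size_t w = write_.load(std::memory_order_relaxed);
    const std::size_t r = read_.load(std::memory_order_acquire);
    if (w - r == Capacity) return false;
    data_[w & (Capacity - 1)] = value;
    // Release: the consumer sees the stored value before the new write count.
    write_.store(w + 1, std::memory_order_release);
    return true;
  }

  // Consumer thread only. Returns false (instead of waiting) when empty.
  bool pop(float &value) {
    const std::size_t r = read_.load(std::memory_order_relaxed);
    const std::size_t w = write_.load(std::memory_order_acquire);
    if (r == w) return false;
    value = data_[r & (Capacity - 1)];
    // Release: the producer sees the slot as free only after it was read.
    read_.store(r + 1, std::memory_order_release);
    return true;
  }

 private:
  std::array<float, Capacity> data_{};
  std::atomic<std::size_t> write_{0};  // total items ever pushed
  std::atomic<std::size_t> read_{0};   // total items ever popped
};
```

A typical use is passing parameter changes or meter values between the audio callback and another thread. The counters only ever grow and are masked on access, which is why the capacity must be a power of two.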

Physical/mechanical systems such as musical instruments respond immediately. Humans expect this; any system that does not “feel” immediate seems broken. Analog/electrical systems respond at the speed of electrical signals through (perhaps complex) circuits, which also “feels” immediate. To make computer audio feel immediate, we need to use small block sizes. Smaller blocks leave less time to compute each one, which increases the probability of glitches, so we must be especially careful to follow the guidelines discussed above.
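
The link between block size and delay is simple arithmetic: one block covers frameCount / sampleRate seconds. The sketch below uses a hypothetical helper, blockLatencyMs, to compute that figure. It is only the buffering delay of a single block; real systems add converter, driver, and extra buffering delays on top.

```cpp
#include <cassert>

// Time covered by one block of audio, in milliseconds.
double blockLatencyMs(int frameCount, int sampleRate) {
  assert(frameCount > 0 && sampleRate > 0);
  return 1000.0 * frameCount / sampleRate;
}
```

At 44100 Hz, a block of 1024 frames covers about 23.2 ms, while a block of 64 frames covers about 1.45 ms. The smaller block feels more immediate, but the callback then has only about 1.45 ms to produce each block.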

Also see The Economic Value of Rapid Response (1983).