The Web Audio API

The Web Audio API is a high-level JavaScript API for processing and synthesizing audio in web applications. The goal of this API is to include capabilities found in modern game audio engines and some of the mixing, processing, and filtering tasks that are found in modern desktop audio production applications.

Initializing an Audio Context

An AudioContext manages and plays all sounds. To produce a sound using the Web Audio API, create one or more sound sources and connect them to the sound destination provided by the AudioContext instance. This connection doesn't need to be direct, and can go through any number of intermediate AudioNodes which act as processing modules for the audio signal. This routing is described in greater detail in the Web Audio specification.

A single instance of AudioContext can support multiple sound inputs and complex audio graphs, so we will only need one of these for each audio application we create. Many of the interesting Web Audio API functions such as creating AudioNodes and decoding audio file data are methods of AudioContext.

The following snippet creates an AudioContext:

var context;
window.addEventListener('load', init, false);
function init() {
  try {
    // Fix up for prefixing
    window.AudioContext = window.AudioContext||window.webkitAudioContext;
    context = new AudioContext();
  }
  catch(e) {
    alert('Web Audio API is not supported in this browser');
  }
}

For WebKit- and Blink-based browsers, you currently need to use the webkit prefix, i.e. webkitAudioContext.

Loading sounds

The Web Audio API uses an AudioBuffer for short- to medium-length sounds. The basic approach is to use XMLHttpRequest for fetching sound files.

The API supports loading audio file data in multiple formats, such as WAV, MP3, AAC, OGG and others. Browser support for different audio formats varies.

The following snippet demonstrates loading a sound sample:

var dogBarkingBuffer = null;
// Fix up prefixing
window.AudioContext = window.AudioContext || window.webkitAudioContext;
var context = new AudioContext();

function loadDogSound(url) {
  var request = new XMLHttpRequest();
  request.open('GET', url, true);
  request.responseType = 'arraybuffer';

  // Decode asynchronously
  request.onload = function() {
    context.decodeAudioData(request.response, function(buffer) {
      dogBarkingBuffer = buffer;
    }, onError);
  };
  request.send();
}

// A simple error handler for decodeAudioData.
function onError(e) {
  console.error('Error decoding audio data', e);
}

The audio file data is binary (not text), so we set the responseType of the request to 'arraybuffer'.

Once the (undecoded) audio file data has been received, it can be kept around for later decoding, or it can be decoded right away using the AudioContext decodeAudioData() method. This method takes the ArrayBuffer of audio file data stored in request.response and decodes it asynchronously (not blocking the main JavaScript execution thread).

When decodeAudioData() is finished, it calls a callback function which provides the decoded PCM audio data as an AudioBuffer.
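
In current browsers, decodeAudioData() also returns a Promise when called without callbacks, so the same load can be written with fetch(). A sketch equivalent to loadDogSound() above:

async function loadDogSound(url) {
  var response = await fetch(url);
  var arrayBuffer = await response.arrayBuffer();
  // decodeAudioData() returns a Promise in current browsers.
  dogBarkingBuffer = await context.decodeAudioData(arrayBuffer);
}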

Playing sounds

Once one or more AudioBuffers are loaded, we're ready to play sounds. Let's assume we've just loaded an AudioBuffer with the sound of a dog barking and that the loading has finished. Then we can play this buffer with the following code.

// Fix up prefixing
window.AudioContext = window.AudioContext || window.webkitAudioContext;
var context = new AudioContext();

function playSound(buffer) {
  var source = context.createBufferSource(); // creates a sound source
  source.buffer = buffer;                    // tell the source which sound to play
  source.connect(context.destination);       // connect the source to the context's destination (the speakers)
  source.start(0);                           // play the source now
                                             // note: on older systems, may have to use deprecated noteOn(time);
}

This playSound() function could be called every time somebody presses a key or clicks something with the mouse.
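
For example, to play the dog bark from the previous section whenever a key is pressed (assuming dogBarkingBuffer has finished loading):

window.addEventListener('keydown', function() {
  playSound(dogBarkingBuffer);
});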

The start(time) function makes it easy to schedule precise sound playback for games and other time-critical applications. However, to get this scheduling working properly, ensure that your sound buffers are pre-loaded. (On older systems, you may need to call noteOn(time) instead of start(time).)
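
The time argument is measured in seconds on the AudioContext clock, the same clock exposed by context.currentTime. For example, to schedule a loaded buffer to play exactly two seconds from now (a sketch, assuming buffer is a pre-loaded AudioBuffer):

var source = context.createBufferSource();
source.buffer = buffer;
source.connect(context.destination);
// Schedule playback two seconds from the current time.
source.start(context.currentTime + 2);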

Abstracting the Web Audio API

Of course, it would be better to create a more general loading system that isn't hard-coded to load this specific sound. There are many approaches for dealing with the many short- to medium-length sounds that an audio application or game would use.
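
The examples below use a BufferLoader class for this purpose, which isn't defined in this excerpt. A minimal sketch, assuming the constructor takes the context, a list of URLs, and a callback that receives the decoded buffers in order:

function BufferLoader(context, urlList, callback) {
  this.context = context;
  this.urlList = urlList;
  this.onload = callback;
  this.bufferList = [];
  this.loadCount = 0;
}

BufferLoader.prototype.loadBuffer = function(url, index) {
  var request = new XMLHttpRequest();
  request.open('GET', url, true);
  request.responseType = 'arraybuffer';

  var loader = this;
  request.onload = function() {
    loader.context.decodeAudioData(request.response, function(buffer) {
      loader.bufferList[index] = buffer;
      // Fire the callback once every URL has been decoded.
      if (++loader.loadCount == loader.urlList.length) {
        loader.onload(loader.bufferList);
      }
    }, function(e) {
      console.error('Error decoding ' + url, e);
    });
  };
  request.send();
};

BufferLoader.prototype.load = function() {
  for (var i = 0; i < this.urlList.length; ++i) {
    this.loadBuffer(this.urlList[i], i);
  }
};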

The following is an example of how you can use the BufferLoader class. Let's create two AudioBuffers and, as soon as they are loaded, play them back at the same time.

window.onload = init;
var context;
var bufferLoader;

function init() {
  // Fix up prefixing
  window.AudioContext = window.AudioContext || window.webkitAudioContext;
  context = new AudioContext();

  bufferLoader = new BufferLoader(
    context,
    [
      '../sounds/hyper-reality/br-jam-loop.wav',
      '../sounds/hyper-reality/laughter.wav',
    ],
    finishedLoading
    );

  bufferLoader.load();
}

function finishedLoading(bufferList) {
  // Create two sources and play them both together.
  var source1 = context.createBufferSource();
  var source2 = context.createBufferSource();
  source1.buffer = bufferList[0];
  source2.buffer = bufferList[1];

  source1.connect(context.destination);
  source2.connect(context.destination);
  source1.start(0);
  source2.start(0);
}

Changing the volume of a sound

One of the most basic operations you might want to do to a sound is change its volume. Using the Web Audio API, we can route our source to its destination through a GainNode in order to manipulate the volume.

This connection setup can be achieved as follows:

// Create a gain node.
var gainNode = context.createGain();
// Connect the source to the gain node.
source.connect(gainNode);
// Connect the gain node to the destination.
gainNode.connect(context.destination);

After the graph has been set up, you can programmatically change the volume by manipulating gainNode.gain.value as follows:

// Reduce the volume.
gainNode.gain.value = 0.5;
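
Setting gain.value takes effect immediately, which can cause audible clicks. For smoother transitions, the gain AudioParam can be scheduled instead; for example, to fade to half volume over one second:

var now = context.currentTime;
// Pin the current value, then ramp down to 0.5 over one second.
gainNode.gain.setValueAtTime(gainNode.gain.value, now);
gainNode.gain.linearRampToValueAtTime(0.5, now + 1);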

Applying a simple filter effect to a sound

The Web Audio API lets you pipe sound from one audio node into another, creating a potentially complex chain of processors to add rich effects to your sounds.

One way to do this is to place BiquadFilterNodes between your sound source and destination. This type of audio node implements a variety of low-order filters which can be used to build graphic equalizers and even more complex effects, mostly to do with selecting which parts of the frequency spectrum of a sound to emphasize and which to subdue.

Supported types of filters include:

- Low pass
- High pass
- Band pass
- Low shelf
- High shelf
- Peaking
- Notch
- All pass

All of the filters include parameters to specify some amount of gain, the frequency at which to apply the filter, and a quality factor. The low-pass filter keeps the lower frequency range but discards high frequencies. The cutoff point is determined by the frequency value, and the Q factor is unitless and determines the shape of the response curve. The gain only affects certain filters, such as the low-shelf and peaking filters, and not this low-pass filter.

Let's set up a simple low-pass filter to extract only the bass from a sound sample:

// Create the filter
var filter = context.createBiquadFilter();
// Create the audio graph.
source.connect(filter);
filter.connect(context.destination);
// Create and specify parameters for the low-pass filter.
filter.type = 'lowpass'; // Low-pass filter. See BiquadFilterNode docs
filter.frequency.value = 440; // Set cutoff to 440 Hz
// Play back the sound.
source.start(0);

In general, frequency controls need to be tweaked to work on a logarithmic scale, since human hearing itself works on the same principle (that is, A4 is 440 Hz, and A5 is 880 Hz).
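
For example, to map a slider position between 0 and 1 onto the filter's frequency range logarithmically (a sketch, assuming sliderValue lies in [0, 1]):

function mapToFrequency(sliderValue) {
  var minValue = 40;                     // lowest frequency of interest, in Hz
  var maxValue = context.sampleRate / 2; // the Nyquist frequency
  // How many octaves lie between min and max.
  var numberOfOctaves = Math.log(maxValue / minValue) / Math.LN2;
  // Move up from the minimum by that fraction of the octave range.
  return minValue * Math.pow(2, numberOfOctaves * sliderValue);
}

// A slider at its midpoint maps to the geometric mean of the range.
filter.frequency.value = mapToFrequency(0.5);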

Lastly, note that the API lets you connect and disconnect nodes dynamically, changing the AudioContext graph on the fly. We can disconnect AudioNodes from the graph by calling node.disconnect(outputNumber). For example, to re-route the graph from going through a filter to a direct connection, we can do the following:

// Disconnect the source and filter.
source.disconnect(0);
filter.disconnect(0);
// Connect the source directly.
source.connect(context.destination);

Source: Boris Smus, Web Audio API: Advanced Sound for Games and Interactive Apps, O'Reilly Media, March 2013.
