Gaming Literacy: An introduction to NES sound and 8-bit sound illusions

Music in modern games is simultaneously far more complex and much simpler than music was in the 8-bit era. From a technical standpoint, modern music technology is far more advanced than 8-bit music technology… obviously. It doesn’t take much to see how a full orchestral score is more complex than the bleeps and bloops of the NES. But from a composition standpoint, it’s actually much simpler.

All we do these days is play some music, record it, and play it back. Granted, that’s a vast oversimplification. It takes a lot of skill to master and edit a professional recording, but the fact is there isn’t much difference between recording a track for an album or movie soundtrack and recording a track for a video game.

This wasn’t always the case. In the early days of digital music, computers weren’t advanced enough to play back high-quality sound files. Instead they had sound chips, and these chips acted like their own unique instruments.

The Basics

Think back to the days of cheap Casio keyboards. These devices also used sound chips. These keyboards were only able to produce a certain number of different tones at once. If you mashed on all the keys at once, you actually only heard 3 or 4 notes. That’s because these sound chips have a finite number of voices or channels. Older sound chips could only produce one sound per channel, so if a keyboard only had three channels you would only ever be able to hear three notes at once.

The same holds true for the sound chips used in older consoles. The NES had five channels and each was locked to a very specific type of tone. The first two channels were able to produce square sound waves. The third was able to produce triangle sound waves. The fourth was able to produce noise, essentially static. Finally, the fifth channel was a rarely used channel that was able to utilize small audio samples.

You’ve probably also heard of keyboards with sampling capability. They allowed you to record any sound you wanted and translated that into notes you could play on the keyboard. While our immature 5-year-old selves used this capability to play songs in the key of “fart,” the NES used this to supplement its existing sound waves with more complex instrumentation.

New Wave

If terms like “square wave” and “triangle wave” sound confusing to you, don’t worry. They are essentially just different types of sound waves. Think of them as the different sorts of instruments the NES sound chip can play. Square waves are the characteristic beeps and boops that have become associated with video game music. Triangle waves are a bit more muffled and handle bass tones better than square waves do. The NES could also adjust the duty cycle of its square wave (the fraction of each cycle the wave spends in its “high” state), allowing it to produce rough and nasal-sounding tones as well.
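To make the shapes concrete, here is a minimal sketch of a square wave with an adjustable duty cycle and a triangle wave. The function names, the -1.0 to +1.0 sample range, and the sample rate are assumptions chosen for illustration; this is not a model of how the NES hardware actually generates its output.

```python
def square_wave(phase, duty=0.5):
    """Return +1.0 for the first `duty` fraction of each cycle, else -1.0."""
    return 1.0 if (phase % 1.0) < duty else -1.0


def triangle_wave(phase):
    """Ramp linearly from -1.0 up to +1.0 and back down once per cycle."""
    p = phase % 1.0
    return 4.0 * p - 1.0 if p < 0.5 else 3.0 - 4.0 * p


def render(wave, freq_hz, seconds, sample_rate=44100, **kwargs):
    """Sample `wave` at `freq_hz` for `seconds`; returns a list of floats."""
    count = int(seconds * sample_rate)
    return [wave(freq_hz * i / sample_rate, **kwargs) for i in range(count)]


# Same pitch, three different timbres (hypothetical settings):
plain_beep = render(square_wave, 440, 0.5, duty=0.5)
thin_buzz = render(square_wave, 440, 0.5, duty=0.125)
soft_bass = render(triangle_wave, 110, 0.5)
```

Changing only the `duty` argument keeps the note the same but changes how long the wave stays high in each cycle, which is what gives the narrower settings their thinner, buzzier character.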

For a more in-depth look at what these waves are and why they sound the way they sound, check out the video by ENJ Music below.

Still, these limited voices weren’t a lot to work with. NES fans are probably swearing that they heard more than four instruments in their favorite chiptune tracks. This was because early game composers were not just incredible musicians but masters of audio illusion. They used the limited tools the sound chip granted them to make their compositions seem more full and complex than they actually were.

Here’s an example from the original Legend of Zelda composed by Koji Kondo.

Examine the square waves at the top of the screen and you’ll notice two things. First, they never stay still. The frequency and amplitude of these waves are constantly shivering back and forth. This turns the harsh “beep” sound of the square wave into something a little more reminiscent of a flute or woodwind instrument. It’s the exact same technology that is used to produce beeps, but the fluctuating tone makes it register in our brain as something other than a beep.
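The “shivering” pitch can be sketched as a small, regular wobble applied to a note’s frequency. The snippet below assumes the music is updated 60 times per second and that the wobble follows a sine curve; the function name, depth, and rate are illustrative guesses, not values taken from the Zelda soundtrack.

```python
import math


def vibrato_frequencies(base_hz, depth_cents, rate_hz, frames, frame_rate=60):
    """Return one frequency per frame, wobbling around `base_hz`.

    `depth_cents` is the maximum pitch offset (100 cents = one semitone)
    and `rate_hz` is how many full wobbles happen per second.
    """
    result = []
    for frame in range(frames):
        t = frame / frame_rate
        cents = depth_cents * math.sin(2 * math.pi * rate_hz * t)
        result.append(base_hz * 2 ** (cents / 1200))
    return result


# One second of an A4 note that drifts up to 20 cents sharp and flat.
flute_like = vibrato_frequencies(440.0, depth_cents=20, rate_hz=6, frames=60)
```

The same idea applied to volume instead of pitch gives the fluctuating amplitude visible in the video.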

We can also see how Koji Kondo overlapped the two square wave channels to create an echo effect early in the composition. This makes the sound of his “wind” instruments feel fuller. Later in the composition he overlaps one square wave channel with the triangle wave channel which is being used for the bass line. This way, when the second square wave drops its duties as bass accompaniment to play the second part of the melody, the listener barely notices.
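One way to picture the echo effect is as a second channel that replays the melody a few frames late and at a lower volume. The sketch below assumes a melody stored as one `(note, volume)` pair per frame, with volume on a 0 to 15 scale; the data layout and the function name are hypothetical.

```python
def echo_channel(melody, delay_frames, volume_scale=0.5):
    """Build a second channel that replays `melody` late and quieter.

    `melody` is a list of (note, volume) pairs, one per frame.
    A note of None means the channel is silent on that frame.
    """
    silence = [(None, 0)] * delay_frames
    delayed = [(note, int(volume * volume_scale)) for note, volume in melody]
    # Trim so both channels cover the same number of frames.
    return (silence + delayed)[:len(melody)]


lead = [("C5", 12), ("C5", 12), ("E5", 12), ("G5", 12)]
echo = echo_channel(lead, delay_frames=1)
# echo is [(None, 0), ("C5", 6), ("C5", 6), ("E5", 6)]
```

Playing `lead` and `echo` together costs a whole second channel, which is why handing the bass line to the triangle channel matters when that channel is needed for melody.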

Here we have a much more complex example, the Solstice title theme by Tim Follin.

In the beginning of the composition we see the aforementioned duty cycle modulation turning the normally beepy square wave into something that sounds more like a trumpet. After the music “kicks in” so to speak, we hear something that Tim Follin was known for, the arpeggio effect. By playing arpeggios quickly in a square wave channel, he turns the normal sound of the square wave into something closer to a string instrument, a harp or guitar in this case.
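The arpeggio effect boils down to cycling through the notes of a chord so quickly that a single channel seems to play all of them. This sketch assumes one pitch update per frame; the function name and the rounded frequencies are chosen for illustration only.

```python
def arpeggio(chord_hz, frames, frames_per_step=1):
    """Return the pitch to play on each frame, cycling through `chord_hz`."""
    return [
        chord_hz[(frame // frames_per_step) % len(chord_hz)]
        for frame in range(frames)
    ]


# Roughly a C major chord (C4, E4, G4), rounded to whole hertz.
c_major = [262, 330, 392]
fast = arpeggio(c_major, frames=7)
# fast is [262, 330, 392, 262, 330, 392, 262]
```

Raising `frames_per_step` slows the cycle down until the notes are heard one at a time instead of blurring into a chord.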

We also get to see the second square wave channel turned almost into a spike using duty cycle modulation. This combined with the echo effect we saw in the previous example gives it an eerie ghostly vibe, almost sounding like a violin. This same technique would be used in Mother to form its array of creepy alien noises.

At around the 37-second mark we see the first square wave channel begin quickly alternating between two notes to once again mimic a string instrument being plucked. Meanwhile the second square wave channel alternates duty cycles to change instruments between a flute and trumpet. We also get to see the noise channel spike in amplitude and slowly fade out to mimic bombastic cymbal crashes.
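The cymbal crash can be approximated as random noise whose volume starts at its maximum and steps down over time. The sketch below uses Python’s pseudo-random generator as a stand-in for the chip’s noise source and assumes a 0 to 15 volume scale; the function name and decay settings are illustrative assumptions.

```python
import random


def noise_burst(frames, start_volume=15, decay_every=4,
                samples_per_frame=8, seed=0):
    """Return noise samples whose loudness steps down as frames pass.

    The volume drops by one every `decay_every` frames and never
    goes below zero. Each sample is +volume or -volume at random.
    """
    rng = random.Random(seed)
    samples = []
    for frame in range(frames):
        volume = max(0, start_volume - frame // decay_every)
        for _ in range(samples_per_frame):
            samples.append(rng.choice((-1, 1)) * volume)
    return samples


crash = noise_burst(frames=64)                 # long, slow fade
drum = noise_burst(frames=8, decay_every=1)    # short, sharp hit
```

The only difference between the two calls is how long the burst lasts and how fast it fades, which is enough to separate a crash from a drum hit.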

Shorter noise spurts are used to simulate drum-like percussion afterward. At around 2:18 we get to see another technique that makes the second square wave channel sound like a string instrument. The tone is changed for just a fraction of a second before holding on a note, once again simulating the sound of a plucked string.

Finally let’s look at this example from Super Mario Bros. 3.

The video shows a full memory visualization of the game while being run, but you can see the sound waves being emitted by the sound chip on the bottom. Here you can see a common technique used by games that have many sound effects. One square wave is in charge of the main melody while another square wave is in charge of an accompaniment. However, this fills all the sound channels and leaves no room for sound effects.

In this case, sound effects are mapped to the same channel as the accompaniment, which is played at a lower volume than the sound effects themselves. When a sound effect is triggered the accompaniment fades out, then fades back in as soon as the effect is over. This makes it unlikely that the player will notice that the accompaniment is missing every time Mario jumps.
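The channel sharing described here can be sketched as a simple priority rule: on any frame where a sound effect is active it takes over the channel, and otherwise the accompaniment plays. The data layout and names below are hypothetical, and the fade in and fade out are left out to keep the idea visible.

```python
def mix_shared_channel(accompaniment, sound_effect):
    """Decide what one shared channel plays on each frame.

    `accompaniment` is a list with one note per frame.
    `sound_effect` maps frame numbers to effect notes; frames that
    are missing from it leave the channel to the music.
    """
    output = []
    for frame, music_note in enumerate(accompaniment):
        effect_note = sound_effect.get(frame)
        output.append(effect_note if effect_note is not None else music_note)
    return output


music = ["C3", "E3", "G3", "E3", "C3"]
jump = {1: "jump-a", 2: "jump-b"}  # effect active on frames 1 and 2
heard = mix_shared_channel(music, jump)
# heard is ["C3", "jump-a", "jump-b", "E3", "C3"]
```

Note that the music keeps advancing underneath the effect, so the accompaniment resumes at the right place in the song rather than where it was interrupted.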

We also get to see the sample track put to use. In this case the sample in question is that of a steel drum. It’s heard clearly in the background during early levels. However, you can also see how a softer, pitch-shifted version of the drum is used for bass percussion in castles and tank stages.
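Pitch shifting a sample amounts to playing the same recording back faster or slower. The sketch below shows the general idea with nearest-neighbor resampling; it makes no claim about how the NES sample channel does this in hardware, and the function name and the tiny stand-in sample are made up for illustration.

```python
def pitch_shift(sample, ratio):
    """Replay `sample` at `ratio` times its original speed.

    A ratio below 1.0 stretches the sample and lowers its pitch;
    a ratio above 1.0 shortens it and raises the pitch.
    """
    length = int(len(sample) / ratio)
    last = len(sample) - 1
    return [sample[min(int(i * ratio), last)] for i in range(length)]


drum_hit = [1, 2, 3, 4]                  # stand-in for recorded audio
lower = pitch_shift(drum_hit, 0.5)       # [1, 1, 2, 2, 3, 3, 4, 4]
higher = pitch_shift(drum_hit, 2.0)      # [1, 3]
```

Halving the speed doubles the length and drops the pitch by an octave, which is how one short recording can double as a deeper percussion sound.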

These are just a few of the techniques used to make a simple sound chip capable of making four or five different tones sound like a whole orchestra to our ears. Similar techniques were used in home computer gaming and in the 16-bit era of consoles. However, the 32-bit era and CD-ROM gaming brought us CD-quality music.

Synth techniques like this were abandoned for higher-quality recorded music tracks. That’s not to say that the art of the chiptune is lost, however. Several bands still make chiptune music today and throwback games, like Shovel Knight, embrace the chiptune sound even if they aren’t necessarily working within the NES’s limitations.