Basic Definitions of Digital Audio

Another great tutorial by Gary on audio, this time taking a somewhat deeper look at its workings.

Sony Creative Software has deep roots in the audio world. I’ve talked about audio topics many times in previous articles that have appeared in this column and we’ve taken a look at many features of our software that make working with audio more powerful and efficient. However, from time to time I talk to people who are not entirely clear on the concepts behind digital audio. In this article we’ll discuss a couple of basic concepts of digital audio, and also spend a few moments talking about audio channels. Much of this may be rudimentary for many of you who have been working with digital audio for a while, but for those who are relatively new to the concepts, this article will shed some light on a topic that can otherwise seem mysterious. Armed with this information, you’ll be able to make informed decisions about how you want to treat the audio in your projects, whether you’re recording a new album of original music or shaping the audio track of your next video or blockbuster film.

I’d like to focus specifically on Sound Forge™ Pro since that application (available both on the PC and the Mac) has established itself as an essential tool for countless audio professionals. However, the general concepts of digital audio are the same regardless of which application you use.

Sound Forge Pro enables you to manipulate data to change the sound of audio recorded digitally to a computer. When people ask what Sound Forge Pro does, I usually start out by saying that you can think of Sound Forge Pro as a word processor for audio files. In fact, you use many of the same techniques to edit a document in a word processor — like cut, copy, and paste — to edit audio in Sound Forge Pro. Other people describe Sound Forge Pro as the utility knife of digital audio because there are so many different things you can do with the software. Basically, if there’s something you want to do with or to audio, there’s a great chance that Sound Forge Pro can do it.

To get a bit more technical in our description, Sound Forge Pro is a multi-channel digital-audio editor. The first part of the definition, “multi-channel,” means that Sound Forge Pro for the PC can create and edit sound files that have from one channel (a mono file) to 32 channels. Each channel in the file is shown as a separate waveform. For instance, in a stereo file, you see two audio waveforms. The waveform on the top represents the audio that comes out of channel #1 (the left channel) and thus, out of your left speaker. The waveform on the bottom represents the audio that comes out of channel #2 (the right channel). Mono files contain only one channel of audio, which Sound Forge Pro sends simultaneously to both speakers. Figure 1 shows a stereo file in Sound Forge Pro on the PC.

Figure 1: A stereo file open in Sound Forge Pro on the PC.
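
To make the idea of channels a little more concrete, here's a minimal Python sketch. It is not taken from Sound Forge Pro; it simply assumes audio is held as a list of frames with one sample per channel, and the helper names mono_to_stereo and split_channels are made up for the example.

```python
def mono_to_stereo(mono_samples):
    """Send each mono sample to both the left and right channels."""
    return [(sample, sample) for sample in mono_samples]


def split_channels(frames):
    """Separate frames into one list of samples per channel."""
    return [list(channel) for channel in zip(*frames)]


# Three stereo frames: (left sample, right sample)
stereo_frames = [(0.1, -0.1), (0.2, -0.2), (0.3, -0.3)]
left, right = split_channels(stereo_frames)
print("left channel: ", left)
print("right channel:", right)

# A mono file played over two speakers carries the same signal on both sides.
print(mono_to_stereo([0.5, 0.25]))
```

Each list returned by split_channels corresponds to one of the waveforms you would see stacked in the data window.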

The next part of the definition, “digital audio,” refers to audio — or sound — that you store in the form of a digital file on a computer drive or other digital media — like a compact disc — rather than in a traditional analog format such as cassette tape or vinyl record. In other words, digital audio is just a bunch of 1s and 0s that get translated into a reproduction of the original audio by a computer or CD player.

Some digital audio sounds fantastic, like the music on your favorite commercial CD. But it doesn’t all sound that good. The difference lies in how we represent the audio information digitally. Two concepts in particular — sampling rate and bit depth — play major roles in the fidelity, or quality, of digital audio.

To record digital audio, Sound Forge Pro stores digital representations, known as samples, of what the audio sounds like at different points in time. The sampling rate is the number of samples taken each second. Higher sampling rates provide more detail in describing the sound. This greater detail lets the recording capture higher frequencies, or pitches, which results in higher audio fidelity.
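
Here's a small Python sketch of what sampling means in practice. It's an illustration only, not how Sound Forge Pro records audio. The function name sample_sine and the 1,000 Hz test tone are assumptions made for the example.

```python
import math


def sample_sine(frequency_hz, sampling_rate_hz, duration_s):
    """Return the value of a sine wave at each sample instant."""
    sample_count = int(round(sampling_rate_hz * duration_s))
    return [
        math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
        for n in range(sample_count)
    ]


# One millisecond of an assumed 1,000 Hz tone (exactly one cycle)
# at three different sampling rates.
for rate in (8000, 44100, 96000):
    samples = sample_sine(1000, rate, 0.001)
    print(f"{rate} Hz: {len(samples)} samples describe one cycle")
```

The higher the rate, the more samples there are to trace the same stretch of the waveform.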

To illustrate all of this, consider the picture of an analog waveform in Figure 2. When we’re recording digitally, we want to record with a high enough sampling rate that we can accurately reproduce the shape of the waveform.

Figure 2: A simple sine wave

Most of us are familiar with the shape of a simple sine wave like this one. Now let’s take a look at a few other audio files in Sound Forge Pro to see the difference that sampling rate makes on the shape of the waveform. Figure 3 shows three digital representations of the simple sine wave we saw in Figure 2. In each case, I’ve zoomed into the sine wave as far as possible in Sound Forge Pro.

Figure 3: The waveforms for the same file look drastically different at three different sampling rates.

In the top data window, I created the sine wave at a very low sampling rate of just 8,000 samples per second (or 8,000 Hz). Since this is a sine wave, technically it should look like the sine wave graphic we saw in Figure 2. However, you can easily see the inaccuracy of the sine wave at this low sampling rate. In fact, it’s difficult to recognize this waveform as a sine wave at all. Note that the dots you see along the waveform line represent individual samples.

The file in the middle data window was created at a much higher sample rate of 44,100 samples per second. Notice that this line contains far more samples and, as a result, looks much more like the smooth waveform graphic we saw in Figure 2, though it’s clearly still not perfect. Even though under this microscopic scrutiny the file still looks quite inaccurate, the human ear can’t hear these inaccuracies. In fact, this sampling rate is the standard rate for audio CDs because it represents an acceptable compromise between accuracy and file size which, by the way, grows with the sampling rate.

Finally, the bottom file was created at a sample rate of 96,000 samples per second. You can see that the waveform looks like a very accurate representation of a smooth sine wave. The higher the audio sample rate, the more accurate the audio reproduction.

If you were to listen to each of these files, you’d notice that even though the waveforms look drastically different, they sound identical. Because a sine wave contains just one frequency, and the frequency of these files happens to be perfectly reproducible even at a sample rate as low as 8,000, you would hear no difference between the three files.

However, it’s a completely different story when the audio file contains multiple frequencies, such as in a music file. In such a case, the file recorded with a sampling rate of only 8,000 samples per second would sound dull and lifeless. You would notice a lack of distinct high-pitched sounds, like the crisp upper frequencies of the cymbals for instance.
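
The reason behind this is a standard rule of digital audio that the article doesn't name: a sampling rate can only represent frequencies below half of its own value (the Nyquist limit). Here's a minimal Python sketch of that rule. The function names and the 10,000 Hz "cymbal" frequency are illustrative assumptions.

```python
def highest_representable_hz(sampling_rate_hz):
    """Nyquist limit: half the sampling rate."""
    return sampling_rate_hz / 2


def can_represent(frequency_hz, sampling_rate_hz):
    """True if the frequency sits below the Nyquist limit for this rate."""
    return frequency_hz < highest_representable_hz(sampling_rate_hz)


# Assumed example frequency for the bright upper range of a cymbal.
cymbal_hz = 10000

for rate in (8000, 44100, 96000):
    limit = highest_representable_hz(rate)
    verdict = "kept" if can_represent(cymbal_hz, rate) else "lost"
    print(f"{rate} Hz sampling: limit {limit:.0f} Hz, {cymbal_hz} Hz content is {verdict}")
```

At 8,000 samples per second the limit is 4,000 Hz, which is why a low sine tone survives while the upper frequencies of a music file do not.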

However, you might be surprised when you compare a 44,100 Hz file with the same audio at a sampling rate of 96,000 Hz because you likely wouldn’t be able to hear a quality difference between the two. Remember, 44,100 Hz represents the point beyond which you can no longer hear the difference in a digital file of higher sampling rate.

So why use a sampling rate higher than 44,100 Hz? In short, a higher sampling rate allows high frequencies to be captured, retained, and reconstructed more accurately.

Remember though, better fidelity comes at a cost: larger file size. For example, my file weighs in at three megabytes (3 MB) with a sampling rate of 8,000. The same file recorded at 44,100 samples per second is over 16 megabytes (16.6 MB). You pay for the better quality with a much larger file size.
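
You can estimate this cost yourself. The sketch below assumes uncompressed audio, where size is simply sampling rate times bytes per sample times channels times duration. The function name pcm_size_bytes and the 60-second stereo example are assumptions for illustration, and real files also carry a small header that is ignored here.

```python
def pcm_size_bytes(sampling_rate_hz, bit_depth, channels, duration_s):
    """Approximate size of uncompressed audio data, ignoring any file header."""
    bytes_per_sample = bit_depth // 8
    return sampling_rate_hz * bytes_per_sample * channels * duration_s


# One minute of 16-bit stereo audio at two sampling rates.
low = pcm_size_bytes(8000, 16, 2, 60)
cd = pcm_size_bytes(44100, 16, 2, 60)
size_ratio = cd / low

print(f"8,000 Hz:  {low:,} bytes")
print(f"44,100 Hz: {cd:,} bytes")
print(f"ratio:     {size_ratio}")
```

With everything else equal, size scales directly with the sampling rate. Going from 8,000 to 44,100 samples per second multiplies the size by about 5.5, so a file of roughly 3 MB grows to roughly 16.5 MB, which is close to the sizes quoted above.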

Sampling rate is one half of the dynamic duo of digital audio. The other main component, bit depth, goes hand in glove with another important issue known as quantization. Bit depth refers to the number of bits that the software uses to define the value of a single sample. A bit is the smallest unit of information used by a computer. Bits are binary, which means that each bit has two possible values: 0 or 1. The more bits you string together in defining a digital sample, the more accurate that sample will be.

Quantization is the process of rounding the actual value of the audio waveform to the nearest value that the sample can describe given the bit depth. The process of quantization introduces errors into the system, which we perceive as noise in the reproduced audio. These errors can be reduced, though never completely eliminated, with higher bit-depth samples. When the bit depth is high enough (about 16 bits), quantization errors and the noise they create become imperceptible to our ears. That’s why CD-quality recordings use a bit depth of 16.
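
Here's a minimal Python sketch of quantization as described above. It assumes sample values run from -1.0 to 1.0 and that the available levels are evenly spaced across that range. The function name quantize and the test value 0.25 are made up for the example, and this is not necessarily how any particular application implements it.

```python
def quantize(value, bit_depth):
    """Round a value in [-1.0, 1.0] to the nearest of 2**bit_depth evenly spaced levels."""
    levels = 2 ** bit_depth
    step = 2.0 / (levels - 1)              # distance between neighboring levels
    index = round((value + 1.0) / step)    # nearest level, counted from the bottom
    index = max(0, min(levels - 1, index)) # keep out-of-range input on the scale
    return -1.0 + index * step


actual = 0.25
errors = []
for bits in (2, 4, 16):
    stored = quantize(actual, bits)
    errors.append(abs(stored - actual))
    print(f"{bits:2d}-bit: stored {stored:.6f}, error {errors[-1]:.6f}")
```

The error shrinks as the bit depth grows, but it never reaches zero unless the actual value happens to land exactly on one of the levels.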

Here’s a simplified analogy. Suppose you want to measure a line and you have a ruler with four measurement marks on it, like the one in Figure 4. We’ll call it a two-bit ruler because at two bits there are only four possible values. In the binary language of the computer those values are 00, 01, 10, and 11. If the line falls between values 10 and 11, we need to round the value — or quantize it — to the nearest measurement. We’ve accomplished our task of measuring the line, but we’ve got quite a bit of easily perceptible error in our measurement.

Figure 4: A two-bit ruler has only four possible values

Now suppose we use a four-bit ruler like the one in Figure 5. This time we have 16 possible values with which to describe the length of the line. Our quantization error is much less perceptible. Carrying this further, with a 16-bit ruler we’d have 65,536 measurement values to use, and with a 24-bit ruler we’d have 16,777,216 levels. You can see that as the bit depth increases, we can measure our line or describe our audio sample much more accurately.

Figure 5: A four-bit ruler gives us much more measuring accuracy than a two-bit ruler
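
The number of marks on each ruler is simply 2 raised to the number of bits. This short Python sketch reproduces the counts mentioned above; the function names level_count and binary_labels are made up for the example.

```python
def level_count(bit_depth):
    """Number of distinct values that bit_depth bits can describe."""
    return 2 ** bit_depth


def binary_labels(bit_depth):
    """The binary label of every mark on a ruler with the given bit depth."""
    return [format(i, f"0{bit_depth}b") for i in range(level_count(bit_depth))]


print(binary_labels(2))  # the four marks on the two-bit ruler

for bits in (2, 4, 16, 24):
    print(f"{bits:2d}-bit ruler: {level_count(bits):,} values")
```

Every extra bit doubles the number of marks, which is why the jump from 16 to 24 bits is so large.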

Our applications generally support bit depths up to 64-bit (float), so you can achieve far-better-than-CD-quality audio resolution with the application. However, just like with higher sampling rates, higher bit depths use more disk space.

Again, if CD-quality is 44,100 samples per second at 16 bit, then why would you ever want to use a higher bit depth? Basically, the extra bit depth means smaller quantization errors, and that equates to less noise. Quantization noise may not start out audible, but audio filters and effects that you apply to your file may accentuate that noise. Higher bit depths enable you to add processing and effects to your project and maintain the integrity of the sound in the course of the calculations. But remember, you pay with increased file size, so you’ll have to determine for yourself if it’s a good bargain or not.
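
To see how processing can accentuate quantization noise, here's a rough Python sketch. It assumes a very simple "effect" (a plain volume boost), a quiet sample value chosen for the example, and made-up function names. Real filters and effects are far more complex, so treat this only as an illustration of the principle.

```python
def quantize(value, bit_depth):
    """Round a value to the nearest step available at this bit depth."""
    step = 1.0 / 2 ** (bit_depth - 1)
    return round(value / step) * step


def error_after_gain(value, bit_depth, gain):
    """Error left in the signal after it is stored and then boosted by gain."""
    return abs(quantize(value, bit_depth) * gain - value * gain)


quiet_sample = 0.000123  # an assumed very quiet passage
boost = 100              # an assumed heavy volume boost

for bits in (16, 24):
    before = error_after_gain(quiet_sample, bits, 1)
    after = error_after_gain(quiet_sample, bits, boost)
    print(f"{bits}-bit: error {before:.2e} before the boost, {after:.2e} after")
```

The boost multiplies the error along with the signal, so the smaller error you start with at 24 bits leaves much more room for processing.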

One thing to remember about using files with higher sampling rates and bit depths: eventually you’ll probably need to deliver the file in a format that uses lower values for both of these components. For example, if you’re burning your file to a CD you’ll have to resample to 44,100 and drop the bit depth to 16, both of which are the specifications for CD. Our software will make these changes for you during the saving or rendering process, but to preserve the highest quality possible you should bring the file into Sound Forge Pro and run the Bit Depth Converter and Resample processes on it. Use those processes to reconfigure your file and then save it to the format and quality you need.
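
For a sense of what dropping the bit depth involves, here's a deliberately simple Python sketch that converts 24-bit integer samples to 16-bit by plain rounding. This is an assumption-laden illustration, not the algorithm behind the Bit Depth Converter; dedicated converters typically offer refinements such as dither that this sketch leaves out. The function name to_16_bit is made up for the example.

```python
def to_16_bit(sample_24):
    """Reduce a signed 24-bit integer sample to signed 16-bit by rounding."""
    rounded = (sample_24 + 128) >> 8          # drop 8 bits, rounding to nearest
    return max(-32768, min(32767, rounded))   # clip to the 16-bit range


samples_24 = [0, 127, 128, 256, 8388607, -8388608]
samples_16 = [to_16_bit(s) for s in samples_24]
print(samples_16)
```

Every group of 256 neighboring 24-bit values collapses into a single 16-bit value, which is the quantization error described earlier showing up at conversion time.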

Hopefully this basic discussion of channels, sampling rate, and bit depth has helped to demystify these topics for you. I hope you now have a better understanding of these settings in your projects and why you might want to consider using higher settings for them. For more training — including archives of all of our articles, free tutorial videos, free webinar archives, and more, visit our training page at

Gary Rebholz is the training manager for Sony Creative Software. Gary produces the popular Seminar Series training packages for Vegas Pro, ACID Pro, and Sound Forge software. He is also co-author of the book Digital Video and Audio Production. Gary has conducted countless hands-on classes in the Sony Creative Software training center, as well as at tradeshows such as the National Association of Broadcasters show.

