Lab 7: Sound

Thu Nov 21 20:57:39 EST 2002

By the end of this lab, you will learn:

how sound is represented and processed in a computer
the implications of different sampling rates for sound files
how to use a sophisticated sound manipulation program (GoldWave)

You will be asked to experiment with sound that originates on an audio CD. If you have some favorite CD, bring it to the lab. Otherwise, you will be forced to impose on your friends or suffer the instructor's antediluvian tastes.

If you are doing this lab in your room, use a good sound system if possible; the built-in speakers in laptops are so bad you can't hear differences. If you have earphones, that's usually a good alternative.

In this lab we explore the manipulation of sound -- the other half of audio-visual media (we did graphics in an earlier lab). Sound is an integral part of many web pages; using the appropriate tools, you can turn your web site into a multimedia experience.

We also take time to explore how the amount of data saved affects the quality of a sound, comparing the quality levels available and the amount of storage each one requires.

Some of the manipulation will be done with a first-rate program called GoldWave. If you do not have GoldWave on your machine, you can download it from the GoldWave web site; a local copy is also available. GoldWave is shareware; if you decide to keep it, you must pay for it.

Part 1: How sound is represented in a computer
Part 2: Recording your own sounds
Part 3: Sound manipulation with GoldWave
Part 4: Finishing up and shutting down

Part 1: How sound is represented in a computer

Sound is a varying air pressure wave, produced by all manner of sources, that is sensed by our ears. In an analog sound system, the pressure wave is captured by some kind of transducer (a microphone, most often) that produces an electrical voltage or current that varies proportionally to the sound pressure. This electrical signal can then be transmitted by telephone, or broadcast by radio, or preserved on magnetic tape (old audio cassettes) or used in other ways. En route, the sound might be processed to change its character in some way, for example to reduce noise or squeeze out unwanted or unneeded frequencies.

At the other end, the electrical signal is used to recreate the sound by vibrating some mechanical surface in a loudspeaker or an earphone, reproducing the original pressure wave (with varying degrees of fidelity) so that we can hear the sound.

The higher-pitched the sound, the more rapid the vibration. A pure tone is a regular sine wave like this:

while a more complicated sound is a composite of such waves, and might look like this:
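
To make the two shapes concrete, here is a minimal sketch that computes a few values of a pure tone and of a composite of three sine waves. The 8000 samples-per-second rate, the frequencies, and the function names are arbitrary choices for illustration, not part of the lab.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an arbitrary choice for this sketch

def pure_tone(freq_hz, n_samples, amplitude=1.0):
    """Return n_samples values of a sine wave at freq_hz."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]

def composite(components, n_samples):
    """Add several (freq_hz, amplitude) sine waves together, value by value."""
    waves = [pure_tone(f, n_samples, a) for f, a in components]
    return [sum(values) for values in zip(*waves)]

tone = pure_tone(440, 16)
mix = composite([(440, 1.0), (880, 0.5), (1320, 0.25)], 16)

# Print the two waves side by side: pure tone on the left, composite on the right.
for t, m in zip(tone, mix):
    print(f"{t:+.3f}  {m:+.3f}")
```

The left column rises and falls smoothly; the right column follows the same overall rhythm but with extra wiggles contributed by the higher-frequency components.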

This is reflected in the analog mechanisms used to preserve the sound. For example in old vinyl LP technology, a record has a long spiral groove whose shape is a representation of the sound waves it records. When the record is played, a fine needle follows the changes in the groove and creates an electric signal proportional to them; when amplified, this drives a loudspeaker.

Today, almost all sound systems are "digital", in the sense that the electrical signal from a transducer like a microphone is converted into a sequence of numeric values, proportional to the strength of the signal, and those numeric values are stored, transmitted, processed, etc., before ultimately being converted back into an electrical signal and then to sound. Thus, for example, an audio CD contains about an hour's worth of sampled voltage values: 44100 samples per second in each of two stereo channels. Each sample is 16 bits, representing one of 65536 possible voltage values; multiplying this out gives about 650 MB for an hour.
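
The multiplication is easy to check. The short sketch below uses only the numbers given in the paragraph above; the variable names are just for illustration.

```python
SAMPLES_PER_SECOND = 44100
CHANNELS = 2             # stereo
BYTES_PER_SAMPLE = 2     # 16 bits
SECONDS_PER_HOUR = 3600

bytes_per_second = SAMPLES_PER_SECOND * CHANNELS * BYTES_PER_SAMPLE
bytes_per_hour = bytes_per_second * SECONDS_PER_HOUR

print(2 ** 16)                       # 65536 possible values per sample
print(bytes_per_second)              # 176400
print(bytes_per_hour)                # 635040000
print(round(bytes_per_hour / 1e6))   # 635
```

An hour comes to roughly 635 million bytes, which is in the same ballpark as the "about 650 MB" quoted above.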

The conversion from continuous analog electrical signals to discrete digital numeric values and back again presents a number of issues that we will explore in this lab:

How often do we sample? Sampling more often means that we can track the changes in a rapidly-varying signal more accurately. If we don't sample often enough, we miss meaningful changes in the signal.
How accurately do we sample? More precise signal measurements mean more accurate representation of the signal wave.
How much space do we need to store the sound? More frequent and more accurate samples mean more data, which can add up very quickly indeed.
How much bandwidth (information carrying capacity) do we need to transmit the sound? It takes more communications capacity to preserve all the low and high frequencies that might be needed (for hi-fi sound, perhaps) or might be unnecessary (telephone quality speech, for instance).
How do we convert back from numbers to an analog waveform? If we intend to reproduce the original waveform with reasonable fidelity, we have to do this conversion carefully as well.
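
The first two questions can be illustrated with a small experiment. The sketch below is a simplified model, not how any particular sound card works: it "samples" a mathematically perfect sine wave, rounds each sample to a fixed number of bits, and measures how far the rounded values stray from the true ones. It then shows what happens when a tone is sampled too slowly. All function names are made up for this example.

```python
import math

def sample_sine(freq_hz, sample_rate, n_samples):
    """Measure a unit-amplitude sine wave n_samples times at sample_rate."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]

def quantize(samples, bits):
    """Round each sample in [-1, 1] to the nearest of 2**bits integer levels."""
    top = 2 ** bits - 1
    return [round((x + 1) / 2 * top) for x in samples]

def dequantize(codes, bits):
    """Map integer levels back to values in [-1, 1]."""
    top = 2 ** bits - 1
    return [c / top * 2 - 1 for c in codes]

def worst_error(bits, freq_hz=440, sample_rate=44100, n_samples=1000):
    """Largest difference between the true samples and their rounded versions."""
    exact = sample_sine(freq_hz, sample_rate, n_samples)
    approx = dequantize(quantize(exact, bits), bits)
    return max(abs(a - e) for a, e in zip(approx, exact))

# How accurately do we sample? More bits give a smaller worst-case error.
print(worst_error(8))
print(worst_error(16))

# How often do we sample? At 8000 samples/second, a 7000 Hz tone produces
# exactly the same numbers as an upside-down 1000 Hz tone, so the two
# cannot be told apart after sampling.
fast = sample_sine(7000, 8000, 16)
slow = sample_sine(1000, 8000, 16)
print(all(abs(f + s) < 1e-9 for f, s in zip(fast, slow)))  # True
```

The second experiment is the reason a sampling rate has to be comfortably more than twice the highest frequency you want to keep.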

Many of the sound systems we see are differentiated mainly in how they choose answers to these questions. This lab will explore some of the tradeoffs.

Analog processing is always lossy -- information leaks away and noise creeps in as analog sounds are processed and transmitted -- and the kinds of transformation of analog information that can be performed are quite limited. By contrast, once a set of numeric values has been obtained that capture a desired sound, those values can be copied, transmitted, processed, and stored in a rich variety of ways, without losing anything of the original. Thus for most purposes, digital sound is preferred.

One common digital process is compression: by taking advantage of how human hearing works, it is possible to compress sound data a great deal -- typically by a factor of ten -- without having a perceptible effect on its quality. This fact is at the heart of music formats like MP3, which are typically 10 times smaller than the equivalent uncompressed sound. We'll do some experiments with MP3 in this lab.

Analogously, we can take advantage of properties of human speech to compress telephone speech a great deal; this is used in cellphones. But the compression techniques that work for speech do not work well for music, as we'll try to demonstrate too.

Other digital transformations include speeding up or slowing down sound without changing its frequency, removing noise and other artifacts, and mixing sounds from multiple sources into a single one. We'll do a little bit of this in this lab.

Finally, it's possible to add carefully controlled redundancy to digital information that makes it possible to detect and even correct some kinds of errors; this is used extensively in digital sound, especially in audio CDs and in cellphones.
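
As a toy illustration of controlled redundancy, the sketch below uses the two simplest schemes there are: a parity bit, which can detect a single flipped bit, and triple repetition, which can correct one. These are stand-ins chosen for brevity; real systems use much stronger codes (audio CDs, for example, use a form of Reed-Solomon coding).

```python
def add_parity(bits):
    """Append one extra bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def parity_ok(bits_with_parity):
    """True if the number of 1s is still even, i.e. no single-bit error is visible."""
    return sum(bits_with_parity) % 2 == 0

def encode_repeat3(bits):
    """Send every bit three times."""
    return [b for b in bits for _ in range(3)]

def decode_repeat3(coded):
    """Recover each bit by majority vote over its three copies."""
    return [1 if sum(coded[i:i + 3]) >= 2 else 0
            for i in range(0, len(coded), 3)]

data = [1, 0, 1, 1]

# Detection: flip one bit in transit and the parity check fails.
sent = add_parity(data)
damaged = sent.copy()
damaged[2] ^= 1
print(parity_ok(sent), parity_ok(damaged))   # True False

# Correction: flip one bit in transit and majority vote repairs it.
coded = encode_repeat3(data)
coded[4] ^= 1
print(decode_repeat3(coded) == data)         # True
```

Note the cost: the parity bit adds one bit per block, while triple repetition triples the data. Practical codes get far more protection per extra bit.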

There's a lot on the web about sound. If you want to do some further reading, here is one clear description of how sound works, out of many.

Part 2: Recording your own sounds

Windows provides a handful of standard programs for sound, and you probably have added a few others on your own machine as well. One of the standards is CD Player, a simple program for playing audio CDs. It's usually found by Programs / Accessories / Entertainment on Win2K; it might be under Multimedia instead. We're assuming that you're familiar with it already, or that it's so obvious that no advice is needed.

The next program is Sound Recorder, which is likely to be in the same place. Sound Recorder can record sounds from a microphone or the CD player. (If you own a microphone, you may bring it in and attach it to the computer to record yourself. Modern laptop machines like those in the clusters have a built-in microphone; the challenge is to find it.) Sound Recorder lets you change the sound quality of a recording, and produce some limited effects.

We will begin by recording from the CD Player. Without closing Sound Recorder, open CD Player if it is not already opened. Now, select a song that you like. After hitting the play button on the CD Player, click the Record button on the Sound Recorder (the rightmost button with the red circle on it). If the computer is recording the music from the CD, you'll see sound waves appearing in the Sound Recorder window. If you don't see anything happening, ask a TA for help.

To stop recording, hit the stop button (the button with the square on it, adjacent to the record button). You can hit the record button any time in the song. To record certain sections of the song, you can hit the Stop button on the Sound Recorder to pause recording, and then the Record button again to continue the recording when the CD arrives at the point at which you wish to continue the recording.

After stopping the CD Player, hit the Rewind button on the Sound Recorder, and then the Play button, to play back what you have recorded. If you are not pleased with what was recorded, select File / New to start a new recording; the last recording will not be saved. To save a recording, go to File / Save As. With the default settings, the sound file is growing at about 22 kilobytes per second (KB/sec), so keep your recordings to a minute or less. We will eventually ask you to put a sound on an HTML page. Remember that the files for any sounds that you want accessible from your Web site must be in your public_html directory. But that directory has a quota, so watch out for extra-long recordings and extra-big files.

There have been reports that Sound Recorder on cluster machines goes through the motions but produces only static. If you find this is happening, you can either use GoldWave or another program like EZ CD Creator to copy from the CD instead; you will either have to experiment or ask one of the lab assistants for help. (With GoldWave, in the Device Controls window, select Properties (under the red "record" circle), then Device, then enable the CD.)

Copyright issues

Virtually all commercial CDs are copyrighted. This means that it is generally illegal for you to distribute copies of songs from your CDs. Making a copy of a song for your own personal, noncommercial use generally qualifies as "fair use" and therefore is allowed.

Very Important: Material on other web sites may be subject to copyright, and there are both legal and University restrictions on what you can do with copyrighted material, including images and sound. The law is evolving in this area and copyright holders are becoming aggressive in asserting their rights (and sometimes more than their rights). You should be aware of the University's policy on fair use of electronic materials.
In part, this says "15. With regard to material on the World Wide Web, if there is an image, a background pattern, a section of text or a musical, film or video selection which you would like to publish yourself on the Web (or elsewhere), you must first obtain permission of the owner or copyright-holder. You are free to establish links to Web pages you enjoy and which you would like to share with others. But you are not free to copy the work of others to publish yourself on the World Wide Web (or elsewhere) or to redistribute the work of others via servers, e-mail or other means without authorization and proper attribution."

You should be especially careful of images or sounds that might be of commercial value to their copyright owners. If a site has an explicit copyright notice, you should not copy anything from it.

Sound quality and file size

The sample rate (how many times per second the sound waves are sampled to make the digital representation), the number of bits per sample, and whether it is a mono or stereo sample determine the quality of a sound recording and the size of the file created to store the sound. By default, Sound Recorder makes a monophonic recording using 8 bits per sample and 22050 samples per second, giving files that grow about 22 KB/sec. There is a range of choices, going from significantly lower quality (mono, 8000 8-bit samples/second, or about 8 KB/sec) to "CD Quality", which is a stereo recording using 16 bits per sample and 44,100 samples per second. These latter files grow at 172 KB/sec, or about 10 MB per minute, which is consistent with the sizes cited above for audio CDs.
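
These growth rates follow directly from the three settings. Here is a minimal sketch of the arithmetic, assuming 1 KB = 1024 bytes; the function name and the labels are made up for this example.

```python
def kb_per_second(sample_rate, bits_per_sample, channels):
    """Uncompressed data rate in KB/sec, taking 1 KB = 1024 bytes."""
    return sample_rate * (bits_per_sample // 8) * channels / 1024

formats = {
    "lowest":     (8000, 8, 1),
    "default":    (22050, 8, 1),
    "CD quality": (44100, 16, 2),
}

for name, fmt in formats.items():
    print(f"{name:10s} {kb_per_second(*fmt):6.1f} KB/sec")
# lowest        7.8 KB/sec
# default      21.5 KB/sec
# CD quality  172.3 KB/sec
```

Multiplying the CD-quality rate by 60 gives a little over 10 MB per minute.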

To change the recording quality, use File / Properties; select "Recording formats" from "Choose from", then select "Convert Now". This opens another window called Sound Selection; select "44.100 kHz, 16 Bit, Stereo". Note that once you have recorded a file using a particular quality, you must choose "Properties" under the "File" menu to change the quality for that file, even if you want to completely record over the contents of the file.

Choose a 20 to 30 second segment of your CD to record. When judging sound quality, differences are often more noticeable on music than on voice, so take this into account in your selection.
Make three recordings of this segment: one at the lowest quality, one at the default (medium) quality, and one at CD quality, and save these in three files, called low.wav, medium.wav and high.wav respectively. Please use these names!
Listen to the recordings. Can you hear any differences? How would you describe them? Look at the file sizes: you can certainly see the differences. Are the file sizes consistent with what is claimed?
You will submit the files for the recording in medium and low quality. Save the CD Quality file locally (e.g., on the Desktop) for now, since you'll be using it later, but you won't be submitting it, since it is large.
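
If you have Python available, one way to check whether the file sizes are consistent with what is claimed is to read the format out of the .wav header and compute the expected amount of sound data. The sketch below uses Python's standard wave module; the function name describe_wav is made up, and the demonstration runs on a small generated file rather than on your recordings.

```python
import os
import tempfile
import wave

def describe_wav(path):
    """Read a .wav header and report its format and expected data size."""
    with wave.open(path, "rb") as w:
        channels = w.getnchannels()
        bytes_per_sample = w.getsampwidth()
        rate = w.getframerate()
        frames = w.getnframes()
    return {
        "channels": channels,
        "bits": 8 * bytes_per_sample,
        "rate": rate,
        "seconds": frames / rate,
        # One frame holds one sample for every channel.
        "data_bytes": frames * channels * bytes_per_sample,
        "file_bytes": os.path.getsize(path),
    }

# Demonstration: write one second of silence in Sound Recorder's default
# format (mono, 8 bits, 22050 samples/second), then inspect it.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(1)
    w.setframerate(22050)
    w.writeframes(bytes([128]) * 22050)  # 128 is the midpoint (silence) for 8-bit samples

info = describe_wav(demo_path)
print(info)
```

To try it on your own recordings, call describe_wav("low.wav") and so on. The file on disk is slightly larger than data_bytes because of the header.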

MIDI representation

There is a widely used representation for instrumental music called MIDI ("Musical Instrument Digital Interface") that does not store sound waves at all. Instead it stores a digital representation of the notes to be played, including what note, what instrument, and what duration. The resulting form is very compact and very flexible for many purposes: it's easy to transpose into a different key, to play on different instruments, and the like.

The device that is going to produce the sounds has to synthesize the sounds from an internal definition of what the sounds sound like. For example, a low-end synthesizer will typically have 64 "voices", representing the 64 different instrumental voices it can approximate. Depending on the device, the fidelity to a real instrument might be anywhere from very good to very bad. Typically, piano sounds are pretty good; the human voice is not, save as a sort of characterless choral effect.

MIDI is used in all kinds of synthesizers; pop bands are very fond of the kinds of sounds it produces. We won't do anything with MIDI in this lab, but it's worth knowing about. If you're interested in creating MIDI, there are software packages like Cakewalk and Noteworthy.
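
To see why a note-based representation is so easy to transpose, consider the sketch below. It is a simplified stand-in, not the actual MIDI file format (which stores timed events in a compact binary form): each note is just a (note number, duration) pair. It relies on the standard MIDI numbering in which note 60 is middle C and note 69 is the A at 440 Hz; the function names are made up for this example.

```python
def note_to_frequency(note_number):
    """Equal-tempered pitch in Hz; note 69 is the A above middle C (440 Hz)."""
    return 440.0 * 2 ** ((note_number - 69) / 12)

def transpose(notes, semitones):
    """Shift every note by the same number of semitones, keeping durations."""
    return [(pitch + semitones, duration) for pitch, duration in notes]

melody = [(60, 0.5), (62, 0.5), (64, 1.0)]   # C, D, E with durations in seconds

print(transpose(melody, 5))               # [(65, 0.5), (67, 0.5), (69, 1.0)]
print(note_to_frequency(69))              # 440.0
print(round(note_to_frequency(60), 2))    # 261.63
```

Transposing is a single addition per note. Doing the same thing to a recorded waveform requires real signal processing.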

Part 3: Sound manipulation with GoldWave

GoldWave is an elaborate program that incorporates all of the features of the programs which come with Windows, but is able to produce even more special effects. In order to get the full experience from this lab, you should download a few different sounds that you find interesting. Here are a couple of Web sites that are not the usual KaZaa and Morpheus, but you're free to find others.

GoldWave handles a variety of sound formats, with Wave (.wav) and MP3 the most common. The "famous speeches" site seems to be mostly in .au format, another widely used encoding. GoldWave will also do sound recording, like Sound Recorder; the controls are on the device control window. You can do your recording this way if you prefer.

Now that you've found some sounds, let's take a look at GoldWave. You will find GoldWave under "Cluster Applications". The program has two windows, the main program window:

and the device controls:

The main window displays the waveforms from sound files you are working with and gives you tools to edit them. The device controls window is for playing sounds, recording, and the like. This display shows 33 seconds of the Aria from Bach's Goldberg Variations, played on a piano, in WAV format. For contrast, this is the same file in 8-bit mono, and just for comparison, here is a link to a very good MIDI version for harpsichord.

Take a moment to become familiar with the main window in GoldWave. Notice that when you drag the mouse across the sound wave (which should be a flat line right now), it highlights it. This is how you will select areas of your recording to use special effects on. Now, look in the Effects menu. Most of the options there are relatively intuitive. If you select an effect without first highlighting some of the sound bar, then it will automatically affect the entire recording. Finally, notice the buttons right underneath the View and Tools menus. These are perhaps the most important buttons in the program. The Undo button will allow you to undo the effects of a menu choice, but you can only go back one step. You can cut and paste highlighted segments of your sound file, just as you would in a normal text file. These allow you to reposition various parts of your song.

Load your sound file into GoldWave (select Open from the File menu). Familiarize yourself with the Device Controls, then experiment.

GoldWave is capable of converting a file into any of a large number of other formats, in effect doing as output what Sound Recorder does as input. You can explore the available conversions by doing Save As..., or by File / Batch Conversion. Be careful not to overwrite your original file if the output format has the same extension as the original.
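
To give a feel for what a conversion to a lower-quality format involves, here is a minimal sketch that turns 16-bit stereo samples into 8-bit mono and halves the sample rate. It is deliberately crude and is not how GoldWave does it: a careful converter filters the sound before dropping samples. It relies on the .wav convention that 16-bit samples are signed (silence is 0) while 8-bit samples are unsigned (silence is 128); the function names are made up for this example.

```python
def stereo16_to_mono8(frames):
    """Convert (left, right) signed 16-bit pairs to unsigned 8-bit mono samples."""
    out = []
    for left, right in frames:
        mono = (left + right) // 2       # average the two channels
        out.append((mono >> 8) + 128)    # keep the top 8 bits, shift range to 0..255
    return out

def halve_rate(samples):
    """Crude rate reduction: keep every other sample (no filtering)."""
    return samples[::2]

frames = [(0, 0), (32767, 32767), (-32768, -32768), (1000, -1000)]
mono8 = stereo16_to_mono8(frames)

print(mono8)               # [128, 255, 0, 128]
print(halve_rate(mono8))   # [128, 0]
```

Each original frame took 4 bytes; each converted sample takes 1, and halving the rate halves the count again, so the result is one eighth the size.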

Make a second low-fi version of your original CD .wav file, using one of the 8-bit mono alternatives; save it in the file low2.wav. Try to find a format that's compact but still sounds acceptable. (The example above is not even very compact and is certainly not acceptable.)
Make a version of your original .wav file in MP3 format. MP3 is the compressed format that is used for almost all music on the Internet (and the conversion is the "ripping" process that you're probably familiar with as a precursor to burning a CD). MP3 is usually about 1/10 the size of the corresponding .wav file. Does your experience bear out this rule of thumb? Note that there are several variants of MP3. Pick the one that is closest to the original .wav encoding, probably 44100 stereo at 160 Kbps.
This may not work if your version of GoldWave does not have an MP3 converter installed (which seems to be true in the cluster systems). GoldWave will offer to help you install one. You can do this yourself: save the file lame_enc.dll in the folder C:\WINNT. Ask a TA for help if necessary. If you have a different ripper, it's fine to use that instead.
Make a 1 bit/sample version in file 1bitmusic.wav. This form of compression is what cellphones tend to do: it's tailored to human speech, and nothing else. How well does this handle music?
Convert a speech example from the famous speeches web site or some other source at 1 bit/sample. Save this as 1bitspeech.wav. How does this highly compressed speech sample compare in quality to the original? How much is it compressed?
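
You can predict roughly what compression ratio to expect from MP3 before you try it, because an MP3 bit rate fixes the number of bits used per second of sound. The sketch below assumes that "160 Kbps" means 160,000 bits per second and ignores file headers; the function name is made up for this example.

```python
CD_BITS_PER_SECOND = 44100 * 16 * 2   # 16-bit stereo at 44100 samples/second

def mp3_ratio(mp3_kbps, source_bps=CD_BITS_PER_SECOND):
    """How many times smaller the MP3 is than the uncompressed source."""
    return source_bps / (mp3_kbps * 1000)

for kbps in (128, 160, 256):
    print(kbps, round(mp3_ratio(kbps), 1))
# 128 11.0
# 160 8.8
# 256 5.5

# Expected sizes in bytes for a 30-second clip:
print(CD_BITS_PER_SECOND * 30 // 8)   # 5292000 as CD-quality .wav
print(160_000 * 30 // 8)              # 600000 as 160 Kbps MP3
```

So the "about 1/10" rule of thumb corresponds to a bit rate near 128 to 160 Kbps; higher bit rates give larger files.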

The last part of the lab is to mix and match anything you like into a short sound file; the only requirement is that you do some mixing of sounds and use some special effects.

Create a 20 to 30 second sound file using GoldWave's Effects features, and save this file for submission. Your file should have at least two obviously different sources spliced together and one or more of GoldWave's "Effects". Bear in mind that you have a space quota on arizona, so although you can make the file as big as you like for your own amusement, you can probably only store 500 KB to 1 MB publicly. You can use MP3 or a lower-quality encoding to keep the size manageable.

Here are some notes on the Effects menu:

Playback Rate: this determines the quality of the sound you will hear back. The higher the number, the better (but the more storage space it will take should you choose to save it).
Transpose: allows you to take your recording and make it start on different notes, thereby making it sound brighter or darker.
Doppler: changes the pitch of the selection. It presents you with a graph, where you can drag the sound line to indicate the points at which you would like the pitch to be high and low.
Reverse: makes the selection go backwards
Silence: eliminates any sound from the selected area
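
Once a sound is a list of numbers, the simpler effects are just list operations. The sketch below shows the idea for Reverse and Silence, plus a basic mix of two sounds. It is an illustration only, not GoldWave's implementation, and it assumes signed 16-bit samples in which 0 means silence; the function names are made up.

```python
def reverse(samples, start, end):
    """Play the selection samples[start:end] backwards."""
    return samples[:start] + samples[start:end][::-1] + samples[end:]

def silence(samples, start, end):
    """Replace the selection with zeros (no sound)."""
    return samples[:start] + [0] * (end - start) + samples[end:]

def mix(a, b):
    """Add two sounds sample by sample, clipping to the 16-bit range."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))   # pad the shorter sound with silence
    b = b + [0] * (n - len(b))
    return [max(-32768, min(32767, x + y)) for x, y in zip(a, b)]

sound = [10, 20, 30, 40, 50]

print(reverse(sound, 1, 4))               # [10, 40, 30, 20, 50]
print(silence(sound, 1, 4))               # [10, 0, 0, 0, 50]
print(mix(sound, [1, 1, 30000, 30000]))   # [11, 21, 30030, 30040, 50]
print(mix([30000], [30000]))              # [32767]
```

The last line shows clipping: when two loud sounds are added, the sum can exceed the largest value a 16-bit sample can hold, which is heard as distortion.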

Part 4: Finishing up and shutting down

We'd like you to place each of the sound files that we asked you to save in your public_html directory and create a new HTML file (not your home page) with links to these sounds. Call this file soundlab.html.

Recall that sound links are created by typing

<A HREF = "filename.wav">This sound file</A>

replacing "filename.wav" with the name of the desired file. We would also like you to put in the HTML file soundlab.html some text to explain what you have done. Here is a template for file soundlab.html that must be used to organize this information. You can download a copy by right-clicking on the link.

These are the files to which you should have links and the explanations you should include in the HTML file:

low.wav, medium.wav and low2.wav: the files for the low and medium quality recordings from Sound Recorder and the low quality version from GoldWave. Include some text with each file link stating what format was used and the size in bytes of the file.
Your CD sound, converted to MP3, and its size. What is the compression ratio, i.e., by what factor is the original larger than the MP3?
Your speech excerpt in speech.wav or other format, and the 1 bit/sample version in 1bitspeech.wav. Again, what is the compression ratio?
Your original music sample, converted to 1 bit/sample in 1bitmusic.wav. What is the compression ratio?
The 20-second sound file effects.wav or effects.mp3 that you produced using GoldWave, and its size in bytes. Briefly, what did you do to create it?
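
The compression ratio asked for above is simply the size of the original file divided by the size of the compressed one. If you would rather not do the division by hand, a sketch like the following works; the function name is made up, and the demonstration uses two small placeholder files rather than real sound files.

```python
import os
import tempfile

def compression_ratio(original_path, compressed_path):
    """By what factor the original file is larger than the compressed one."""
    return os.path.getsize(original_path) / os.path.getsize(compressed_path)

# Demonstration with placeholder files of 1000 and 100 bytes.
folder = tempfile.mkdtemp()
big = os.path.join(folder, "original.bin")
small = os.path.join(folder, "compressed.bin")
with open(big, "wb") as f:
    f.write(bytes(1000))
with open(small, "wb") as f:
    f.write(bytes(100))

print(compression_ratio(big, small))   # 10.0
```

For your own files, pass the names of the original .wav file and its compressed version.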

Use the template soundlab.html for submission. When you're done, send email to cs109@princeton.edu or cs111@princeton.edu with subject "Lab 7 -- Your name".

If you saved anything on the Desktop that you want to preserve, be sure to transfer it to your public_html folder on arizona. And make sure the files are readable: we can't grade it if we can't read it.

If you've completed the lab, sent your email to cs109@princeton.edu or cs111@princeton.edu and transferred your work to your Unix account, then you are finished.