Non-linear One Pole Filter Application:
The Envelope Follower

A very useful algorithm for HCI and other signal detection applications is the Envelope Follower.

First the signal is rectified (absolute value, which tracks amplitude) or squared (which tracks power), then a special one pole low-pass smoothing filter is applied. The idea is to track low frequency amplitude peaks accurately while filtering out high frequencies. This is accomplished with a signal-dependent filter pole and gain: one coefficient is used while the signal is rising, and another while it is falling. The figure below shows this:

If we choose b_up = 0.5 or so, and b_down = 0.99, we'll get a fast rising response and a slow falling response. The slow falling response is what filters out the high frequency information we don't want. The result of envelope following on a quasi-periodic impulsive signal (the sound of footsteps) is shown here:

A typical application might be for EMG data. The EMG envelope signal is a good estimate of overall muscle tension.
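
The envelope follower described above can be sketched in C roughly as follows. Since the figure is not reproduced here, this sketch assumes the common one pole form y[n] = (1 - b)|x[n]| + b y[n-1], where b is switched to b_up when the rectified input is above the current envelope and to b_down otherwise. The function name envelope_follow and its argument list are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/* One pole envelope follower with a signal-dependent coefficient.
   ASSUMPTION: filter form y = (1 - b) * |x| + b * y_prev, with
   b = b_up while the rectified input exceeds the envelope (rising)
   and b = b_down otherwise (falling). */
void envelope_follow(const float *in, float *out, size_t n,
                     float b_up, float b_down)
{
    float env = 0.0f;                            /* envelope starts at zero */
    for (size_t i = 0; i < n; i++) {
        float x = in[i] < 0.0f ? -in[i] : in[i]; /* rectify (absolute value) */
        float b = (x > env) ? b_up : b_down;     /* pick pole by direction */
        env = (1.0f - b) * x + b * env;          /* gain (1 - b), pole b */
        out[i] = env;
    }
}
```

With b_up = 0.5 and b_down = 0.99, a single unit impulse gives 0.5 on the first output sample, after which the envelope decays by a factor of 0.99 per sample. To follow power rather than amplitude, the rectification line could square the input instead.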

X. Using DSP to Classify Signals

This is a pretty dumb example of a useful function that decides if a chunk of audio is silence, noisy (a consonant), or pitched (a vowel).

The first block computes the power in the current chunk of the signal (sum of squares of the samples, divided by the number of samples in the chunk).

If the power is less than some threshold (how do we set that??), then the chunk is judged to be silent.

If not, then the second block computes the number of "zero-crossings" in the chunk (a measure of the high-frequency energy).

If that number is greater than some threshold (again, how do we set this one?), then the chunk is judged to be a consonant, or noisy.

If not, then it is a vowel, or pitched (or low frequency noise??).
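
The decision steps above can be sketched in C as follows. This is only an illustration, not the code from the files linked below: the names ChunkClass, chunk_power, zero_crossings, and classify_chunk are hypothetical, and both thresholds are left as parameters because the text does not say how to set them.

```c
#include <stddef.h>

typedef enum { CHUNK_SILENCE, CHUNK_NOISY, CHUNK_PITCHED } ChunkClass;

/* Mean power: sum of squares divided by the number of samples. */
double chunk_power(const float *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += (double)x[i] * (double)x[i];
    return n ? sum / (double)n : 0.0;
}

/* Count sign changes between consecutive samples. */
size_t zero_crossings(const float *x, size_t n)
{
    size_t count = 0;
    for (size_t i = 1; i < n; i++)
        if ((x[i - 1] >= 0.0f) != (x[i] >= 0.0f))
            count++;
    return count;
}

/* Silence if power is below power_thresh; otherwise noisy (consonant)
   if the zero-crossing count exceeds zc_thresh; otherwise pitched (vowel).
   ASSUMPTION: both thresholds are supplied by the caller. */
ChunkClass classify_chunk(const float *x, size_t n,
                          double power_thresh, size_t zc_thresh)
{
    if (chunk_power(x, n) < power_thresh)
        return CHUNK_SILENCE;
    if (zero_crossings(x, n) > zc_thresh)
        return CHUNK_NOISY;
    return CHUNK_PITCHED;
}
```

In practice the thresholds would have to be tuned on real recordings, and the zero-crossing threshold scales with the chunk length, so it is often easier to reason about it as crossings per sample.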

Here is some original speech.

Here are Consonants separated by the above algorithm.

Here are Vowels separated by the above algorithm.

Here's some code that does this on soundfiles.
You need both files, and you need to edit waveio.h for your architecture.
waveio.h
ZC-VCDet.c (Float, True Power Version)
ZC-VCDetInt.c (Integer, Fake Power (no multiplies) version).
But what if we wanted to do more, like figure out whether it's speech or music in the first place? Or figure out whether the speaker is male or female, or angry, or something else?
We need more complex tools for doing this type of classification. That, Little Adam, is another story.

* Permission to make digital or hard copies of part or all
of this work for personal or classroom use is granted with
or without fee provided that copies are not made or
distributed for profit or commercial advantage and that copies
bear this notice and full citation on the first page. To copy
otherwise, to republish, to post on services, or to redistribute
to lists, requires specific permission and/or a fee.
