Speech Segregation Based on Pitch Tracking and Amplitude Modulation

Guoning Hu and DeLiang Wang

To appear at the Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA01), Mohonk Mountain Resort, NY, 21-24 October 2001


Speech segregation is an important task in auditory scene analysis (ASA): separating the speech of a target speaker from interfering signals. Wang and Brown proposed a multistage neural model for speech segregation, the core of which is a two-layer oscillator network. In this paper, we extend their model with additional processing stages motivated by psychoacoustic evidence, namely pitch tracking and grouping based on amplitude modulation (AM). We systematically evaluate our model, compare it with the Wang-Brown model, and find that it yields significantly better performance.
