Excitation Codebook Design for Efficient Coding of the Singing Voice

Youngmoo E. Kim

To appear at Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA01), Mohonk Mountain Resort, NY, 21-24 October 2001


The technique of Code Excited Linear Prediction (CELP) has led to the development of voice coding systems that provide toll-quality speech at very low bitrates. While speech and singing share many similarities in terms of production, standard speech coding implementations fall far short when transmitting the singing voice. This paper explores the reasons for this discrepancy and suggests new variations on CELP speech coders that specifically enhance the quality of transmitted singing. Building on these variations, a low-bitrate singing voice codec is introduced that could be used in conjunction with multi-track structured coding schemes, such as MPEG-4 Structured Audio, which can provide a highly compressed yet high-quality representation of a complete song.

