A Simple and Effective Spectral Feature for Speech Detection in Mixed Audio Signals

Reinhard Sonnleitner; Bernhard Niedermayer; Gerhard Widmer; Jan Schlüter
DAFx-2012 - York
We present a simple and intuitive spectral feature for detecting the presence of speech in mixed audio signals (speech, music, arbitrary sounds and noises). The feature is based on simple observations about the harmonics, with their characteristic trajectories, that appear in signals containing speech. Experiments with about 70 hours of radio broadcasts in five different languages demonstrate that the feature is very effective in detecting and delineating segments that contain speech, and that it also seems to be quite general and robust with respect to different languages.