MOSPALOSEP: A Platform for the Binaural Localization and Separation of Spatial Sounds using Models of Interaural Cues and Mixture Models

Joan Mouba
DAFx-2010 - Graz
In this paper, we present the MOSPALOSEP platform for the localization and separation of binaural signals. Our methods operate on the short-time spectra of the recorded binaural signals. Based on a parametric model of the binaural mix, we jointly evaluate the interaural cues to derive the location of each time-frequency bin. We then describe different approaches to localization: some are based on an energy-weighted histogram in azimuth space, and others on a Gaussian Mixture Model in which the number of sources is identified without supervision using the Minimum Description Length criterion. The resulting Gaussian Mixture Model structure identifies the region dominated by each source in a multi-source mix. A bank of spatial masks then extracts each source, using either posterior-probability masks or Maximum Likelihood binary masks. An important condition is the Windowed-Disjoint Orthogonality of the sources in the time-frequency domain. We assess the source separation algorithms specifically on mixes of instruments, where this fundamental condition is not satisfied.
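
As a minimal illustration of the localization and masking ideas summarized above, the following Python sketch builds an energy-weighted azimuth histogram and derives soft (posterior) and binary (maximum likelihood) spatial masks from a one-dimensional Gaussian mixture in azimuth space. It is not the MOSPALOSEP implementation: the function names, the azimuth range of -90 to 90 degrees, the number of histogram bins, and the mixture parameters are all assumptions made for the example, and the mixture is taken as given rather than fitted (neither the fitting nor the Minimum Description Length selection is shown).

```python
import math


def energy_weighted_histogram(azimuths, energies, n_bins=18, lo=-90.0, hi=90.0):
    """Accumulate the energy of each time-frequency bin at its estimated azimuth.

    The range [lo, hi] in degrees and n_bins are illustrative assumptions.
    """
    hist = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for az, energy in zip(azimuths, energies):
        if lo <= az <= hi:
            # Clamp so that az == hi falls into the last histogram bin.
            idx = min(int((az - lo) / width), n_bins - 1)
            hist[idx] += energy
    return hist


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def spatial_masks(azimuths, weights, means, stds):
    """Return (soft, binary) masks, indexed as mask[source][bin].

    soft:   posterior probability of each source given the bin azimuth.
    binary: 1.0 for the source whose Gaussian has the highest likelihood
            (mixture weights ignored), 0.0 for the others.
    """
    n_src = len(weights)
    soft = [[] for _ in range(n_src)]
    binary = [[] for _ in range(n_src)]
    for az in azimuths:
        lik = [gaussian_pdf(az, m, s) for m, s in zip(means, stds)]
        joint = [w * l for w, l in zip(weights, lik)]
        total = sum(joint)
        # Fall back to a uniform posterior if every density underflows to zero.
        post = [j / total if total > 0.0 else 1.0 / n_src for j in joint]
        best = max(range(n_src), key=lambda k: lik[k])
        for k in range(n_src):
            soft[k].append(post[k])
            binary[k].append(1.0 if k == best else 0.0)
    return soft, binary


# Toy data: four time-frequency bins, two hypothetical sources near -30 and 40 degrees.
azimuths = [-30.0, -28.0, 40.0, 42.0]
energies = [1.0, 2.0, 3.0, 4.0]
hist = energy_weighted_histogram(azimuths, energies)
soft, binary = spatial_masks(azimuths, weights=[0.5, 0.5],
                             means=[-30.0, 40.0], stds=[5.0, 5.0])
```

Multiplying each mask by the short-time spectrum of the mix, bin by bin, would give an estimate of the corresponding source. The binary mask assigns each bin to a single source and therefore relies directly on the Windowed-Disjoint Orthogonality assumption, whereas the soft mask shares the energy of a bin between sources according to their posterior probabilities.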