Bayesian Identification of Closely-Spaced Chords from Single-Frame STFT Peaks
Identifying chords and related musical attributes from digital audio has proven a long-standing problem spanning many decades of research. A robust identification may facilitate automatic transcription, semantic indexing, polyphonic source separation and other emerging applications. To this end, we develop a Bayesian inference engine operating on single-frame STFT peaks. Peak likelihoods conditional on pitch component information are evaluated by an MCMC approach accounting for overlapping harmonics as well as undetected/spurious peaks, thus facilitating operation in noisy environments at very low computational cost. Our inference engine evaluates posterior probabilities of musical attributes such as root, chroma (including inversion), octave and tuning, given STFT peak frequency and amplitude observations. The resultant posteriors become highly concentrated around the correct attributes, as demonstrated using 227 ms piano recordings with −10 dB additive white Gaussian noise.
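As an illustration of the kind of observation model such an engine works with (and not the paper's MCMC inference itself), the sketch below extracts single-frame STFT peaks and scores hypothetical chord F0 sets with a simple Gaussian-in-cents peak likelihood. The function names, the harmonic count and the 30-cent spread are illustrative assumptions.

```python
# Minimal sketch (not the paper's MCMC engine): pick single-frame STFT peaks and
# score a hypothetical set of chord F0s with a crude Gaussian peak-frequency
# likelihood. All parameters here are illustrative assumptions.
import numpy as np

def stft_peaks(frame, sr, n_fft=8192, n_peaks=20):
    """Return (frequency, amplitude) pairs of the strongest local spectral maxima."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n_fft))
    freqs = np.fft.rfftfreq(n_fft, 1.0 / sr)
    is_peak = (spec[1:-1] > spec[:-2]) & (spec[1:-1] > spec[2:])
    idx = np.where(is_peak)[0] + 1
    idx = idx[np.argsort(spec[idx])[::-1][:n_peaks]]
    return freqs[idx], spec[idx]

def log_likelihood(peak_freqs, chord_f0s, n_harm=8, sigma_cents=30.0):
    """Score observed peaks against the union of harmonics predicted by the chord's F0s."""
    predicted = np.concatenate([f0 * np.arange(1, n_harm + 1) for f0 in chord_f0s])
    ll = 0.0
    for f in peak_freqs:
        cents = 1200.0 * np.abs(np.log2(f / predicted))   # distance to nearest predicted harmonic
        ll += -0.5 * (cents.min() / sigma_cents) ** 2     # spurious peaks are penalised softly
    return ll
```

Comparing the scores of competing chord hypotheses (e.g. different roots or inversions) on the same peak list mimics, very loosely, the posterior comparison the abstract describes.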
A New Score Function for Joint Evaluation of Multiple F0 Hypotheses
This article is concerned with the estimation of the fundamental frequencies of the quasiharmonic sources in polyphonic signals for the case where the number of sources is known. We propose a new method for jointly evaluating multiple F0 hypotheses based on three physical principles: harmonicity, spectral smoothness and synchronous amplitude evolution within a single source. Given the observed spectrum, a set of F0 candidates is listed, and for each hypothetical combination of candidates the corresponding hypothetical partial sequences are derived. These hypothetical partial sequences are then evaluated using a score function that formulates the guiding principles in mathematical form. The algorithm has been tested on a large collection of artificially mixed polyphonic samples, and the encouraging results demonstrate the competitive performance of the proposed method.
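A minimal sketch of the joint-evaluation idea follows (not the authors' exact score function): each F0 in a hypothetical combination is assigned a hypothetical partial sequence picked from the observed peaks, and the combination is scored by a harmonicity term plus a spectral-smoothness penalty. The 50-cent tolerance and the 0.5 weighting are assumptions.

```python
# Illustrative joint-hypothesis score in the spirit of the principles above
# (harmonicity + spectral smoothness); weights and tolerances are assumptions.
import numpy as np
from itertools import combinations

def partial_amplitudes(peak_freqs, peak_mags, f0, n_part=10, tol_cents=50.0):
    """For each predicted partial of f0, take the nearest observed peak within tolerance.
    peak_freqs are detected (nonzero) peak frequencies."""
    amps = np.zeros(n_part)
    for k in range(1, n_part + 1):
        cents = 1200.0 * np.abs(np.log2(peak_freqs / (k * f0)))
        j = np.argmin(cents)
        if cents[j] < tol_cents:
            amps[k - 1] = peak_mags[j]
    return amps

def hypothesis_score(peak_freqs, peak_mags, f0_combo):
    harmonicity, smoothness = 0.0, 0.0
    for f0 in f0_combo:
        a = partial_amplitudes(peak_freqs, peak_mags, f0)
        harmonicity += a.sum()                    # spectral energy explained by the hypothesis
        smoothness -= np.abs(np.diff(a)).sum()    # irregular partial envelopes are penalised
    return harmonicity + 0.5 * smoothness

def best_combination(peak_freqs, peak_mags, candidates, n_sources):
    """Jointly evaluate all combinations of the listed F0 candidates."""
    return max(combinations(candidates, n_sources),
               key=lambda c: hypothesis_score(peak_freqs, peak_mags, c))
```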
Sound Source Separation: Azimuth Discrimination and Resynthesis
In this paper we present a novel sound source separation algorithm which requires no prior knowledge, no learning, assisted or otherwise, and performs the task of separation based purely on azimuth discrimination within the stereo field. The algorithm exploits the use of the pan pot as a means to achieve image localisation within stereophonic recordings. As such, only an interaural intensity difference exists between left and right channels for a single source. We use gain scaling and phase cancellation techniques to expose frequency dependent nulls across the azimuth domain, from which source separation and resynthesis is carried out. We present results obtained from real recordings, and show that for musical recordings, the algorithm improves upon the output quality of current source separation schemes.
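The gain-scaling and phase-cancellation step can be sketched as follows: one channel is scaled by a range of candidate gains and subtracted from the other, so a source panned with that intensity ratio cancels, producing a frequency-dependent null at the matching gain. This single-frame sketch only builds the azimuth-frequency map; it is not the full resynthesis algorithm, and the gain grid and FFT size are assumptions.

```python
# Sketch of azimuth discrimination via gain scaling and cancellation.
import numpy as np

def azimuth_nulls(left_frame, right_frame, n_gains=64, n_fft=4096):
    win = np.hanning(len(left_frame))
    L = np.fft.rfft(left_frame * win, n_fft)
    R = np.fft.rfft(right_frame * win, n_fft)
    gains = np.linspace(0.0, 1.0, n_gains)               # candidate intensity ratios (azimuths)
    # azimugram[i, k]: residual magnitude at gain i and frequency bin k;
    # a null along the gain axis marks the azimuth of the source dominating that bin.
    azimugram = np.abs(L[None, :] - gains[:, None] * R[None, :])
    null_gain_index = np.argmin(azimugram, axis=0)        # gain index of the null in each bin
    return gains, azimugram, null_gain_index

# Bins whose null falls near a chosen gain can then be grouped and inverse-FFT'd
# to resynthesise the source panned at that azimuth.
```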
Source Separation for WFS Acoustic Opening Applications
This paper proposes a new scheme to reduce the coding bit rate in array-based multichannel audio applications such as the acoustic opening, which can be used in modern teleconference systems. The combination of beamforming techniques for source separation and wave field synthesis allows a significant coding bit rate reduction. To evaluate the quality of this new scheme, both objective and subjective tests have been carried out. The objective measurement system is based on the Perceptual Audio Quality Measure of the binaural signal that the listener would perceive in a real environment.
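As an example of the array-based source-separation front end mentioned above, the sketch below implements a plain delay-and-sum beamformer steered towards a source direction; the paper's actual beamforming design may differ, and all parameter choices here are assumptions.

```python
# Minimal delay-and-sum beamformer sketch (illustrative, not the paper's design).
import numpy as np

def delay_and_sum(mic_signals, mic_positions, look_direction, sr, c=343.0):
    """mic_signals: (n_mics, n_samples); mic_positions: (n_mics, 3) in metres;
    look_direction: unit vector pointing from the array towards the desired source."""
    delays = mic_positions @ look_direction / c           # relative arrival-time compensation, seconds
    delays -= delays.min()                                # keep all applied delays causal
    n_mics, n_samples = mic_signals.shape
    freqs = np.fft.rfftfreq(n_samples, 1.0 / sr)
    out = np.zeros(n_samples)
    for m in range(n_mics):                               # fractional delay applied in the frequency domain
        spec = np.fft.rfft(mic_signals[m])
        out += np.fft.irfft(spec * np.exp(-2j * np.pi * freqs * delays[m]), n_samples)
    return out / n_mics
```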
Analysis of Certain Challenges for the Use of Wave Field Synthesis in Concert-Based Applications
Wave Field Synthesis (WFS) provides a means for reproducing 3D sound fields over an extended area. Beyond conventional audio reproduction applications, present research at IRCAM involves augmenting the realism of concert-based applications in which real musicians interact on stage with virtual sources reproduced by WFS. The challenge in such a situation is to create virtual sound sources that behave as closely as possible to real sound sources, in order to obtain a natural balance between real and virtual sources. The goal of this article is to point out the physical differences between real sound sources and WFS-reproduced sources situated at the same position, considering successively the sound field associated with the direct sound of the virtual source and its interaction with the room. Methods for taking these differences into account and compensating for them are proposed.
A Maximum Likelihood Approach to Blind Audio De-Reverberation
Blind audio de-reverberation is the problem of removing reverb from an audio signal without explicit data regarding the system and/or the input signal. It is a more difficult signal-processing task than ordinary de-reverberation based on deconvolution. In this paper, different blind de-reverberation algorithms derived from kurtosis maximization and a maximum likelihood approach are analyzed and implemented.
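A kurtosis-maximising adaptive filter of the kind referred to above can be sketched as a stochastic gradient ascent on the output kurtosis, typically applied to a linear-prediction residual. The step size, filter length and moment-smoothing factor below are assumptions, not the paper's settings.

```python
# Sketch of a kurtosis-maximising adaptive filter (stochastic gradient ascent).
import numpy as np

def kurtosis_max_filter(x, n_taps=256, mu=1e-5, beta=0.99):
    w = np.zeros(n_taps); w[0] = 1.0                 # start from an identity (pass-through) filter
    m2, m4 = 1.0, 3.0                                # running 2nd/4th moment estimates of the output
    y = np.zeros(len(x))
    for n in range(n_taps, len(x)):
        xn = x[n - n_taps + 1:n + 1][::-1]           # most recent samples first
        y[n] = w @ xn
        m2 = beta * m2 + (1 - beta) * y[n] ** 2
        m4 = beta * m4 + (1 - beta) * y[n] ** 4
        grad = 4.0 * (m2 * y[n] ** 3 - m4 * y[n]) / (m2 ** 3 + 1e-12)
        w += mu * grad * xn                          # ascend the kurtosis gradient
        w /= np.linalg.norm(w) + 1e-12               # keep the filter norm bounded
    return y, w
```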
Practical Implementation of the 3D Tetrahedral TLM Method and Visualization of Room Acoustics
This paper concerns the implementation of a 3D transmission line matrix (TLM) algorithm based on a tetrahedral mesh structure, together with the visualization of room acoustics simulations. Although TLM is a well-known method, 3D implementations are less commonly found in the literature. We have implemented the TLM method using a tetrahedral mesh of pressure nodes, with transmission lines superimposed on the nearest-neighbour bonds of a tetrahedral atomic lattice. Results of simulations are compared with those of a standard 3D Cartesian mesh and a 2D mesh implementation of TLM. An important feature is a useful graphics interface designed for user-friendly control of room acoustics simulation and visualization in arbitrarily shaped rooms containing objects of arbitrary size and number. The paper includes brief discussions of results obtained using different techniques for modeling totally or partially absorptive boundaries.
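For readers unfamiliar with TLM, the scatter/connect structure of the method can be shown on a much simpler mesh. The sketch below uses a 2D Cartesian lattice with periodic boundaries purely for brevity; the paper's tetrahedral 3D mesh and its boundary models are considerably more involved.

```python
# Compact 2D Cartesian TLM sketch (the paper's mesh is 3D tetrahedral).
import numpy as np

def tlm_step(incident):
    """incident: (4, nx, ny) pulses arriving at each node along its +x, -x, +y, -y branches."""
    node_sum = incident.sum(axis=0)
    scattered = 0.5 * node_sum - incident             # lossless 4-branch scattering: (2/N)*sum - incident
    new_incident = np.empty_like(incident)
    # connect: a pulse scattered into the +x branch of node (i, j) arrives at node
    # (i+1, j) on its -x branch at the next step (np.roll makes the boundaries periodic;
    # a real implementation would model reflective/absorptive walls instead).
    new_incident[1] = np.roll(scattered[0], 1, axis=0)
    new_incident[0] = np.roll(scattered[1], -1, axis=0)
    new_incident[3] = np.roll(scattered[2], 1, axis=1)
    new_incident[2] = np.roll(scattered[3], -1, axis=1)
    return new_incident, 0.5 * node_sum                # nodal pressure = (2/N) * sum of incident pulses
```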
RoomWeaver: A Digital Waveguide Mesh Based Room Acoustics Research Tool
RoomWeaver is a Digital Waveguide Mesh (DWM) based, Integrated Development Environment (IDE) style research tool, similar in appearance and functionality to other current acoustics software. The premise of RoomWeaver is to ease the development and application of DWM models for virtual acoustic spaces. This paper demonstrates the basic functionality of RoomWeaver's 3D modelling and Room Impulse Response (RIR) generation capabilities. A case study is presented to show how new DWM types can be quickly developed and easily tested using RoomWeaver's built-in plug-in architecture, through the implementation of a hybrid-type mesh. This hybrid mesh is composed of efficient, yet geometrically inflexible, finite-difference DWM elements and geometrically versatile, but slow, wave-based DWM elements. The two types of DWM are interfaced using a KW-pipe; the hybrid model exhibits a significant increase in execution speed and a smaller memory footprint compared with standard wave-based DWM models, while still allowing nontrivial geometries to be modelled successfully.
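The efficient finite-difference ("K-model") mesh elements mentioned above reduce, on a rectilinear grid, to a very simple pressure update; the 2D sketch below shows that update only (clamped boundaries, no KW-pipe interface), as a rough indication of what such a mesh computes, not as RoomWeaver's implementation.

```python
# Finite-difference form of a rectilinear waveguide-mesh update (2D, for brevity).
import numpy as np

def dwm_fd_step(p_now, p_prev):
    """Rectilinear mesh update: p[n+1] = (2/N) * sum(neighbours at n) - p[n-1], N = 4 in 2D."""
    p_next = np.zeros_like(p_now)
    p_next[1:-1, 1:-1] = 0.5 * (p_now[:-2, 1:-1] + p_now[2:, 1:-1] +
                                p_now[1:-1, :-2] + p_now[1:-1, 2:]) - p_prev[1:-1, 1:-1]
    return p_next

# Typical use: inject an impulse at a source node, iterate dwm_fd_step, and record
# the pressure history at a receiver node to obtain a mesh-sampled RIR.
```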
Real Time Modeling of Acoustic Propagation in Complex Environments
In order to achieve high-quality audio-realistic rendering in complex environments, we need to determine all the acoustic paths that go from sources to receivers, due to specular reflections as well as diffraction phenomena. In this paper we propose a novel method for computing and auralizing the reflected as well as the diffracted field in 2.5D environments. The method is based on a preliminary geometric analysis of the mutual visibility of the environment's reflectors, which allows us to compute all possible acoustic paths on the fly, as the information on sources and receivers becomes available. The beam tree is, in fact, constructed through a lookup of the precomputed visibility information, and acoustic paths are determined through a lookup on the resulting beam tree. We also show how to model diffraction using the same beam tree structure used for modeling reflection and transmission. In order to validate the method, we conducted an acquisition campaign in a real environment and compared the measurements with the results obtained with our real-time simulation system.
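The paper's contribution is the visibility-driven beam-tree lookup; as a much simpler illustration of what a single specular acoustic path is, the sketch below uses the classic image-source construction in 2D (first order only): mirror the source across a wall and check that the image-to-receiver line crosses the wall segment. Function names are illustrative.

```python
# First-order image-source sketch of a specular reflection path (2D).
import numpy as np

def _cross2(p, q):
    return p[0] * q[1] - p[1] * q[0]

def first_order_path(src, rcv, wall_a, wall_b):
    """Return the specular reflection point of src -> wall -> rcv on segment wall_a-wall_b, or None."""
    src, rcv, a, b = (np.asarray(v, dtype=float) for v in (src, rcv, wall_a, wall_b))
    d = b - a
    n = np.array([-d[1], d[0]]) / np.linalg.norm(d)         # wall normal
    image = src - 2.0 * np.dot(src - a, n) * n              # mirror the source across the wall
    r = rcv - image
    denom = _cross2(r, d)
    if abs(denom) < 1e-12:                                   # image->receiver line parallel to the wall
        return None
    t = _cross2(a - image, d) / denom                        # parameter along image -> receiver
    u = _cross2(a - image, r) / denom                        # parameter along the wall segment
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return a + u * d                                     # valid specular reflection point
    return None                                              # receiver not reachable via this wall
```

The path length (and hence delay) of a valid reflection is simply the distance from the image source to the receiver; higher-order paths repeat the mirroring recursively, which is exactly the combinatorial growth the beam-tree/visibility lookup is designed to tame.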
Decorrelation Techniques for the Rendering of Apparent Sound Source Width in 3D Audio Displays
The aim of this paper is to give an overview of the techniques and principles for rendering the apparent extent of sound sources in 3D audio displays. We mainly focus on techniques that use decorrelation as a means of decreasing the Interaural Cross-Correlation Coefficient (IACC), which has a direct impact on the perceived source extent. We then present techniques where decorrelation is varied in time and frequency, allowing temporal and spectral variations in the spatial extent of sound sources to be created. Frequency-dependent decorrelation can be employed to create an effect where a sound is spatially split into its different frequency bands, each having a different position and spatial extent. We finally present the results of psychoacoustic experiments aimed at evaluating the effectiveness of decorrelation-based techniques for the rendering of sound source extent. We found that the intended source extent matches the mean source extent perceived by subjects well.
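The basic decorrelation idea can be sketched by filtering a source with two random-phase all-pass responses to obtain the left/right feeds, and then measuring the IACC over the usual ±1 ms lag range. The filter length and the broadband (non frequency-dependent) design below are simplifying assumptions relative to the time- and frequency-varying schemes surveyed in the paper.

```python
# Random-phase all-pass decorrelator pair and IACC measurement (illustrative).
import numpy as np

def random_allpass_pair(n_fft=2048, seed=0):
    rng = np.random.default_rng(seed)
    def one():
        phase = rng.uniform(-np.pi, np.pi, n_fft // 2 + 1)
        phase[0] = phase[-1] = 0.0                        # keep DC and Nyquist bins real
        return np.fft.irfft(np.exp(1j * phase), n_fft)    # unit-magnitude spectrum -> all-pass FIR
    return one(), one()

def iacc(left, right, sr, max_lag_ms=1.0):
    """Normalised interaural cross-correlation, maximised over lags of +-1 ms (equal-length inputs)."""
    max_lag = int(sr * max_lag_ms / 1000.0)
    corr = np.correlate(left, right, mode='full')         # lags -(N-1)..(N-1)
    centre = len(right) - 1                               # index of zero lag
    window = corr[centre - max_lag:centre + max_lag + 1]
    norm = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    return np.max(np.abs(window)) / norm

# Usage: h_l, h_r = random_allpass_pair(); y_l = np.convolve(x, h_l); y_r = np.convolve(x, h_r).
# A lower IACC between y_l and y_r corresponds to a wider apparent source.
```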