GUI front-end for spectral warping
This paper describes a software tool developed in Java to facilitate time and frequency warping of audio spectra. The application uses the Java Advanced Imaging (JAI) API, which contains classes for image manipulation and, in particular, for non-linear warping using polynomial transformations. Warping of spectral representations is fundamental to sound processing techniques such as sound transformation and morphing. Dynamic time warping has been the method of choice in many implementations of temporal and spectral alignment for morphing. This tool instead offers an interactive approach to warping, allowing greater flexibility in achieving a desired transformation. The output of the application can then be used as input to a signal synthesis routine, which recovers the transformed sound.
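As a rough illustration of polynomial warping, the sketch below warps a one-dimensional magnitude spectrum along the frequency axis. It is not the tool's implementation and does not use the Java imaging API: the function name `warp_spectrum`, the normalised coordinate `u`, and the linear interpolation are all assumptions made for this example.

```python
def warp_spectrum(magnitudes, coeffs):
    """Warp a magnitude spectrum along the frequency axis (hypothetical helper).

    Output bin k reads from source position p(u) * (K - 1), where
    u = k / (K - 1) and p(u) = coeffs[0] + coeffs[1]*u + coeffs[2]*u**2 + ...
    Values between source bins are linearly interpolated.
    Requires at least two bins.
    """
    last = len(magnitudes) - 1
    warped = []
    for k in range(len(magnitudes)):
        u = k / last
        src = sum(c * u ** i for i, c in enumerate(coeffs)) * last
        src = min(max(src, 0.0), float(last))  # clamp to the valid bin range
        lo = int(src)
        hi = min(lo + 1, last)
        frac = src - lo
        warped.append((1.0 - frac) * magnitudes[lo] + frac * magnitudes[hi])
    return warped


spectrum = [float(k) for k in range(9)]               # toy ramp "spectrum"
identity = warp_spectrum(spectrum, [0.0, 1.0])        # p(u) = u, no warping
squeezed = warp_spectrum(spectrum, [0.0, 0.0, 1.0])   # p(u) = u**2
```

With `p(u) = u**2`, each output bin reads from a lower source bin, so spectral features move towards higher frequencies. A time-frequency image would be warped with a two-variable polynomial in the same way.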
Extracting automatically the perceived intensity of music titles
We address the automatic extraction of high-level musical descriptors from raw audio signals. This work focuses on the perceived intensity of music titles, a descriptor that evaluates how energetic listeners perceive the music to be. We first present the perceptual tests that we conducted to evaluate the relevance and universality of the perceived intensity descriptor. We then present several methods for extracting the features used to build automatic intensity extractors: the usual MPEG-7 low-level features, an empirical method, and features found automatically by our Extractor Discovery System (EDS). Finally, we compare the performance of the resulting extractors.
Perceptual evaluation of weighted multi-channel binaural format
This paper deals with the perceptual evaluation of an efficient method for creating 3D sound material on headphones. The two main issues of the classical two-channel binaural rendering technique are computational cost and individualization. Both are emphasized in the context of a general-purpose 3D auditory display, and multi-channel binaural synthesis attempts to address them. Several studies have been dedicated to this approach, in which the minimum-phase parts of the Head-Related Transfer Functions (HRTFs) are linearly decomposed in order to separate the direction and frequency variables. The present investigation aims to improve this model by applying weighting functions to the reconstruction error, so as to focus the modeling effort on the most perceptually relevant cues in the frequency or spatial domain. To validate the methodology, a localization listening test is conducted with static stimuli, using a reporting interface that minimizes interpretation errors. Beyond the optimization of the binaural implementation, one of the main questions addressed by the study is the search for a perceptually relevant definition of the reconstruction error.
Extraction of the excitation point location on a string using weighted least-square estimation of a comb filter delay
This paper focuses on the extraction of the excitation point location on a guitar string by an iterative estimation of the structural parameters of the spectral envelope. We propose a general method to estimate the plucking point location that works in two stages: starting from a measure related to the autocorrelation of the signal as a first approximation, a weighted least-squares estimation is used to refine an FIR comb filter delay value so that it better fits the measured spectral envelope. This method is based on the fact that, in a simple digital physical model of a plucked-string instrument, the resonant modes translate into an all-pole structure, while the initial conditions (a triangular shape for the string and zero velocity at all points) result in an FIR comb filter structure.
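The link between a comb filter delay and the plucking position can be sketched as follows. This example swaps the paper's iterative refinement for a plain grid search, and it assumes an ideal comb `1 - z^-d` whose magnitude at the n-th harmonic of a string with period `T` samples is `2*|sin(pi*n*d/T)|`. The function names, the `1/n` weights, and the synthetic "measured" magnitudes are all hypothetical.

```python
import math


def comb_magnitude(n, delay, period):
    """Magnitude of the FIR comb 1 - z^-delay at harmonic n (period in samples)."""
    return 2.0 * abs(math.sin(math.pi * n * delay / period))


def estimate_comb_delay(harmonic_mags, period, weights=None, steps=2000):
    """Grid search for the delay minimising a weighted squared error.

    The search covers (0, period/2] only, because delays d and period - d
    produce identical comb magnitudes at the harmonics.
    """
    if weights is None:
        weights = [1.0] * len(harmonic_mags)
    best_delay, best_cost = None, float("inf")
    for k in range(1, steps + 1):
        delay = 0.5 * period * k / steps
        cost = 0.0
        for n, (mag, w) in enumerate(zip(harmonic_mags, weights), start=1):
            cost += w * (mag - comb_magnitude(n, delay, period)) ** 2
        if cost < best_cost:
            best_delay, best_cost = delay, cost
    return best_delay


period = 100.0       # assumed fundamental period in samples
true_delay = 20.0    # pluck at 20% of the string length
measured = [comb_magnitude(n, true_delay, period) for n in range(1, 11)]
weights = [1.0 / n for n in range(1, 11)]  # assumed weighting favouring low harmonics
estimated = estimate_comb_delay(measured, period, weights)
relative_position = estimated / period
```

The ratio `estimated / period` gives the plucking point as a fraction of the string length, up to the symmetry between the two ends of the string.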
A non-linear technique for room impulse response estimation
Most techniques used to estimate the transfer function (or impulse response) of an acoustical space operate along similar principles. A known, broadband signal is transmitted at one point in the room whilst being simultaneously recorded at another. A matched filter is then used to compress the energy of the transmitted waveform in time, forming an approximate impulse response. Finally, equalisation filtering is used to remove any colouration and phase distortion caused by the non-uniform energy spectrum of the transmission and/or the non-ideal response of the loudspeaker/microphone combination. In this paper, the limitations of this conventional technique are highlighted, especially when using low-cost equipment. An alternative, non-linear deconvolution technique is proposed and shown to give superior performance when using non-ideal equipment.
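The conventional matched-filter step described above can be sketched in a few lines. This illustrates only the conventional approach, not the non-linear technique the paper proposes. The pseudo-random excitation of ±1 samples, the two-tap toy "room", and the function names are assumptions made for the example.

```python
import random


def convolve(x, h):
    """Direct-form linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def matched_filter_estimate(recorded, excitation, n_lags):
    """Cross-correlate the recording with the known excitation.

    For a broadband excitation whose autocorrelation is close to an impulse,
    the normalised cross-correlation approximates the impulse response.
    """
    energy = sum(s * s for s in excitation)
    estimate = []
    for lag in range(n_lags):
        acc = 0.0
        for i, s in enumerate(excitation):
            if i + lag < len(recorded):
                acc += s * recorded[i + lag]
        estimate.append(acc / energy)
    return estimate


rng = random.Random(0)
excitation = [1.0 if rng.random() < 0.5 else -1.0 for _ in range(2048)]

room = [0.0] * 32   # toy impulse response: direct sound plus one reflection
room[5] = 1.0
room[12] = 0.5

recorded = convolve(excitation, room)
estimate = matched_filter_estimate(recorded, excitation, len(room))
```

The estimate shows clear peaks at lags 5 and 12 on top of a low noise floor. That floor comes from the imperfect autocorrelation of the excitation, which is one reason the estimate is only approximate before equalisation.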
Sound spatialization based on fast beam tracing in the dual space
This paper addresses the problem of geometry-based sound reverberation for applications of virtual acoustics. In particular, we propose a novel method that significantly speeds up the construction of the beam tree in beam tracing applications by avoiding space subdivision. This allows us to dynamically recompute the beam tree as the sound source moves. To speed up the construction of the beam tree, we determine what portion of which reflectors the beam “illuminates” by performing visibility checks in the “dual” of the geometric space.
Software for the simulation, performance, analysis and real-time implementation of wave field synthesis systems for 3D-audio
Wave Field Synthesis (WFS) is a method for 3D sound reproduction based on the precise construction of the desired wave field using an array of loudspeakers. The main purpose of this work is to present a set of software tools that gives the audio community a feasible and easy way to start working with wave field synthesis systems. The paper first gives an introduction to different 3D sound techniques and an overview of WFS theory and foundations. Next, a series of software tools specially developed to simulate, analyze and implement WFS systems is presented. The first software module helps the user design the loudspeaker array to be employed in the reproduction by computing the equations for each speaker's excitation signal. Another tool simulates the wave field generated by the arrays and analyses both the performance and the quality of the acoustic field. Finally, a user-friendly tool for real-time convolution, capable of producing the excitation signals for the loudspeaker array, is presented. Experiments carried out with this software to evaluate the precision and behaviour of different WFS configurations are also presented and interpreted.
On the use of spatial cues to improve binaural source separation
Motivated by the human sense of hearing, we devise a computational model suitable for the localization of many sources in stereo signals, and apply it to the separation of sound sources. The method employs spatial cues in order to resolve high-frequency phase ambiguities. More specifically, we use relationships between the short-time Fourier transforms (STFT) of the two signals to estimate the two most important spatial cues, namely the time differences (TD) and level differences (LD) between the sensors. Using models of both free-field wave propagation and head-related transfer functions (HRTF), these cues are combined to form estimates of spatial parameters such as the directions of arrival (DOA). The theory is validated by the experimental results presented in the paper.
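How the two cues can be read from a pair of transform coefficients is sketched below for a single frequency bin. This is a minimal illustration rather than the paper's model: it assumes one rectangular frame, a single source, and a pure delay and gain between the channels, and the function names are hypothetical.

```python
import cmath
import math


def dft_bin(frame, k):
    """Single DFT coefficient of a real frame at bin k."""
    size = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * n / size)
               for n, x in enumerate(frame))


def spatial_cues(left, right, k, sample_rate):
    """Return (time difference in seconds, level difference in dB) at bin k.

    A positive time difference means the right channel lags the left.
    The phase-based estimate is unambiguous only while |omega * TD| < pi,
    which is why high frequencies need extra information to resolve.
    Requires k > 0 and non-zero energy in the left channel at bin k.
    """
    ratio = dft_bin(right, k) / dft_bin(left, k)
    omega = 2.0 * math.pi * k * sample_rate / len(left)  # rad/s at bin k
    time_diff = -cmath.phase(ratio) / omega
    level_db = 20.0 * math.log10(abs(ratio))
    return time_diff, level_db


sample_rate = 16000
size = 512
k = 16                 # bin centred on 500 Hz
delay_samples = 4      # right channel lags by 0.25 ms
gain = 0.5             # right channel is about 6 dB quieter

left = [math.cos(2.0 * math.pi * k * n / size) for n in range(size)]
right = [gain * math.cos(2.0 * math.pi * k * (n - delay_samples) / size)
         for n in range(size)]

time_diff, level_db = spatial_cues(left, right, k, sample_rate)
```

In a full system these per-bin values would be computed for every STFT frame and frequency, then mapped to a direction of arrival through a propagation or HRTF model.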
Room simulation for binaural sound reproduction using measured spatiotemporal impulse responses
In binaural sound reproduction systems, the incorporation of room simulation is important for improving sound source localisation. It decreases the localisation error and, at the same time, enhances externalisation (out-of-head localisation). Previously proposed approaches are based on simple geometrical models for room simulation. In this paper, an alternative method using measured room impulse responses (RIRs) is presented, which makes it possible to obtain a convincing acoustical image of an existing room. The RIRs are measured using a circular microphone array to capture both temporal and spatial information about the desired room.
Automatic synthesis strategies for object-based dynamical physical models in musical acoustics
Current physics-based synthesis techniques tend to synthesize the interaction between the different functional elements of a sound generator by treating it as a single system. However, when dealing with the physical modeling of complex sound generators, this choice raises questions about the flexibility of the adopted synthesis strategy. One way to overcome this problem is to synthesize and discretize individually the objects that contribute to the generation of sounds. In this paper we address the problem of how to automate the process of physically modeling the interaction between objects, and how to make it dynamical. We show that this can be done through the automatic definition and implementation of a topology model that adapts to the contact and proximity conditions between the considered objects.