Wave field synthesis interaction with the listening environment: improvements in the reproduction of virtual sources situated inside the listening room
Holophonic sound reproduction using Wave Field Synthesis (WFS) [1] aims at recreating a virtual spatialized sound scene over an extended area. Applying this technique to synthesize virtual sources located within an indoor environment can create striking audio effects in the context of virtual or augmented reality applications. However, interactions of the synthesized sound field with the listening room must be taken into account, since they modify the resulting sound field. This paper enumerates some of these interactions according to different virtual scene configurations and applications. Particular attention is paid to the reproduction of the sound source directivity and to the reproduction of a room effect coherent with the real environment. Solutions for synthesizing the directivity of the source and the associated room effect are proposed and discussed through simulations, developments and a first perceptual validation.
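One simple way to illustrate source directivity reproduction over a loudspeaker array (a generic sketch, not necessarily the authors' method; the array geometry, first-order pattern and function names are assumptions) is to weight each loudspeaker feed by the virtual source's directivity evaluated at the angle from the source towards that loudspeaker:

```python
import numpy as np

def directivity_gains(src_pos, src_aim_deg, spk_positions, pattern=0.5):
    """Weight each loudspeaker feed by the virtual source's directivity
    evaluated at the angle from the source towards that loudspeaker.
    pattern=0.0 gives an omni source, 0.5 a cardioid, 1.0 a figure-of-eight.
    Illustrative first-order pattern; the paper's actual model may differ."""
    aim = np.deg2rad(src_aim_deg)
    gains = []
    for spk in spk_positions:
        dx, dy = spk[0] - src_pos[0], spk[1] - src_pos[1]
        angle = np.arctan2(dy, dx) - aim               # angle relative to source aim
        g = (1.0 - pattern) + pattern * np.cos(angle)  # first-order directivity
        gains.append(max(g, 0.0))                      # clip the negative rear lobe
    return np.asarray(gains)

# Example: virtual source 2 m behind a 16-speaker line array on y = 0,
# aimed perpendicular to the array
speakers = [(x, 0.0) for x in np.linspace(-2.0, 2.0, 16)]
print(directivity_gains(src_pos=(0.0, -2.0), src_aim_deg=90.0,
                        spk_positions=speakers))
```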
Monitoring distance effect with wave field synthesis
Wave Field Synthesis (WFS) [1] rendering allows the reproduction of virtual point sources. Depending on source positioning, the wave front synthesized in the listening area exhibits a given curvature that is responsible for a sensation of spatial perspective. It is then possible to monitor the distance of a source with a "holophonic distance" parameter, concurrently with conventional distance cues based on control of the direct-to-reverberation ratio. This holophonic distance is presented and then discussed in the context of authoring sound scenes for WFS installations. The work has three goals: introducing WFS to sound engineers in an active listening test in which they manipulate different parameters to construct a sound scene; assessing the perceptual relevance of holophonic distance modifications; and studying the possible link between the holophonic distance parameter and the conventional subjective distance parameters traditionally used by sound engineers.
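The wavefront curvature behind the holophonic distance cue can be illustrated with textbook delay-and-attenuate driving signals for a virtual point source behind a linear array (a simplified stand-in for a full WFS driving function; names and the 1/sqrt(r) approximation are assumptions):

```python
import numpy as np

C = 343.0  # speed of sound (m/s)

def point_source_feeds(src, speakers, fs=48000):
    """Per-loudspeaker delay (samples) and gain for a virtual point source
    behind the array. The delays trace out a curved wavefront whose
    curvature increases as the source approaches the array: the
    'holophonic distance' cue. 1/sqrt(r) decay is a common line-array
    approximation; full WFS driving functions also include a
    frequency-dependent filter, omitted here."""
    src = np.asarray(src, float)
    spk = np.asarray(speakers, float)
    r = np.linalg.norm(spk - src, axis=1)       # source-to-speaker distances
    delays = np.round(r / C * fs).astype(int)   # propagation delay in samples
    gains = 1.0 / np.sqrt(r)
    return delays - delays.min(), gains / gains.max()

speakers = [(x, 0.0) for x in np.linspace(-2.0, 2.0, 16)]
for d in (0.5, 2.0, 8.0):  # moving the source away flattens the wavefront
    delays, gains = point_source_feeds((0.0, -d), speakers)
    print(f"distance {d} m -> delay spread {delays.max()} samples")
```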
An expressive real-time sound model of rolling
This paper describes the structure and potential of a real-time sound model of "rolling". The work has its background and motivation in the ecological approach to psychoacoustics. The scope of interest is the efficient and clear (possibly exaggerated) acoustic expression, or cartoonification, of certain ecological attributes, rather than realistic simulation for its own sake. To this end, different techniques of sound generation are combined in a hybrid hierarchical structure. A physics-based algorithm of impact interaction at the audio core (section 2) is surrounded by higher-level structures that explicitly model macroscopic characteristics (section 5). Another connecting audio-level algorithm, the "rolling filter", reduces the (three-dimensional) geometry of the rolling contact to the one dimension of the impact-interaction model (section 3).
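The audio-core impact interaction can be sketched as a nonlinear contact force driving a single resonant mode (a minimal stand-in for the paper's section-2 algorithm; all constants and names here are illustrative assumptions):

```python
import numpy as np

def impact_on_mode(fs=44100, dur=0.05, f0=800.0, decay=8.0,
                   k=1e9, alpha=1.5, v0=-1.0):
    """Minimal impact model: a point mass strikes one resonant mode.
    Contact force follows a one-sided Hertzian-style nonlinear spring
    f = k * compression^alpha; the mode is a damped oscillator driven by
    the reaction force. Semi-implicit (symplectic) Euler integration
    keeps the oscillator stable. Constants are illustrative only."""
    n = int(fs * dur)
    dt = 1.0 / fs
    xh, vh, mh = 0.0, v0, 0.01    # hammer: position, velocity, mass
    w0 = 2 * np.pi * f0
    xm, vm = 0.0, 0.0             # modal resonator state (mass-normalized)
    out = np.zeros(n)
    for i in range(n):
        comp = xm - xh                                # compression depth
        f = k * comp**alpha if comp > 0 else 0.0      # contact force (one-sided)
        vh += (f / mh) * dt                           # force pushes hammer back out
        xh += vh * dt
        vm += (-w0**2 * xm - 2 * decay * vm - f) * dt  # mode driven by reaction
        xm += vm * dt
        out[i] = xm
    return out / (np.max(np.abs(out)) + 1e-12)
```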
EGOSOUND, an egocentric, interactive and real-time approach to sound space
Developed in the context of virtual environments linked to artistic creation, Egosound was conceived to visualize multiphonic sound space in an egocentric way. Ambisonic algorithms and a graphic trajectory editor are implemented, and an example of an artistic project built with Egosound is presented. Parts of the Egosound software engine are now available for download as external objects for the Pure Data and Max/MSP graphical programming environments.
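For context, first-order Ambisonic (B-format) encoding, the family of algorithms Egosound implements, places a mono source at a given azimuth and elevation by weighting it onto four channels. This is the generic textbook formula, not Egosound's actual code:

```python
import numpy as np

def encode_bformat(mono, azimuth_deg, elevation_deg=0.0):
    """First-order Ambisonic (B-format) encoding of a mono signal.
    Traditional convention: W carries the signal at 1/sqrt(2), while
    X/Y/Z carry the direction cosines of the source position."""
    az = np.deg2rad(azimuth_deg)
    el = np.deg2rad(elevation_deg)
    w = mono / np.sqrt(2.0)
    x = mono * np.cos(az) * np.cos(el)
    y = mono * np.sin(az) * np.cos(el)
    z = mono * np.sin(el)
    return np.stack([w, x, y, z])

# Example: a 1 kHz tone placed 45 degrees to the left
fs = 48000
t = np.arange(fs) / fs
bfmt = encode_bformat(np.sin(2 * np.pi * 1000 * t), azimuth_deg=45.0)
```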
Musically expressive sound textures from generalized audio
We present a method of musically expressive synthesis-by-analysis that takes advantage of recent advancements in auditory scene analysis and sound separation algorithms. Our model represents incoming audio sub-conceptually, using statistical decorrelation techniques that abstract away individual auditory events, leaving only the gross parameters of the sound: the "eigensound", or generalized spectral template. Using these approaches we present various optimization guidelines and musical enhancements, specifically with regard to the beat and temporal nature of the sounds, with an eye towards real-time effects and synthesis. Our model produces completely novel and pleasing sound textures that can be varied by tuning the parameters of the "unmixing" weight matrix.
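A minimal sketch of the eigensound idea, assuming PCA-style decorrelation of spectrogram frames (the paper's exact analysis chain may differ, and the function and parameter names here are assumptions):

```python
import numpy as np
from scipy.signal import stft, istft

def eigensound_texture(x, fs, n_components=8, jitter=0.3, seed=0):
    """Decorrelate spectrogram frames with an SVD (PCA), keep only the
    leading components ('eigensounds'), perturb their activations, and
    resynthesize. Perturbing the weights varies the texture while
    preserving the gross spectral character. Sketch only."""
    f, t, Z = stft(x, fs, nperseg=1024)
    mag, phase = np.abs(Z), np.angle(Z)
    mean = mag.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(mag - mean, full_matrices=False)
    U, s, Vt = U[:, :n_components], s[:n_components], Vt[:n_components]
    rng = np.random.default_rng(seed)
    Vt_new = Vt * (1.0 + jitter * rng.standard_normal(Vt.shape))  # vary weights
    mag_new = np.clip(mean + U @ np.diag(s) @ Vt_new, 0.0, None)
    _, y = istft(mag_new * np.exp(1j * phase), fs, nperseg=1024)
    return y
```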
GUI front-end for spectral warping
This paper describes a software tool developed in the Java language to facilitate time and frequency warping of audio spectra. The application utilises the Java Advanced Imaging (JAI) API, which contains classes for image manipulation and, in particular, for non-linear warping using polynomial transformations. Warping of spectral representations is fundamental to sound processing techniques such as sound transformation and morphing. Dynamic time warping has been the method of choice for many implementations of temporal and spectral alignment for morphing; this tool offers an advantage by providing an interactive approach to warping, allowing greater flexibility in achieving a desired transformation. The output of the application can then be used as input to a signal synthesis routine, which recovers the transformed sound.
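Treating a spectrogram as an image and warping it with polynomial coordinate transforms can be sketched as follows (an illustrative stand-in for the JAI polynomial warp; the coefficient convention and names are assumptions):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_spectrogram(mag, time_poly=(0.0, 1.0), freq_poly=(0.0, 1.0, 0.0)):
    """Warp a magnitude spectrogram (freq x time) by polynomial coordinate
    transforms, as in image warping: output pixel (f, t) samples the input
    at (freq_poly(f), time_poly(t)). Coefficients are given in ascending
    order of degree. Bilinear interpolation (order=1) fills in between."""
    nf, nt = mag.shape
    f_idx = np.polyval(list(reversed(freq_poly)), np.arange(nf))
    t_idx = np.polyval(list(reversed(time_poly)), np.arange(nt))
    F, T = np.meshgrid(np.clip(f_idx, 0, nf - 1),
                       np.clip(t_idx, 0, nt - 1), indexing="ij")
    return map_coordinates(mag, [F, T], order=1)

# Example: mild quadratic frequency stretch, identity mapping in time
mag = np.abs(np.random.randn(513, 200))          # placeholder spectrogram
warped = warp_spectrogram(mag, freq_poly=(0.0, 0.9, 2e-4))
```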
Extracting automatically the perceived intensity of music titles
We address the issue of automatically extracting high-level musical descriptors from raw audio signals. This work focuses on the extraction of the perceived intensity of music titles, i.e. how energetic the music is perceived to be by listeners. We first present the perceptual tests we conducted to evaluate the relevance and universality of the perceived intensity descriptor. We then present several methods used to extract relevant features for building automatic intensity extractors: standard MPEG-7 low-level features, an empirical method, and features found automatically by our Extractor Discovery System (EDS), and we compare the final performance of the resulting extractors.
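As a rough illustration of the kind of low-level, energy-related features such extractors build on (not the paper's actual feature set, and not EDS output), one can compute frame RMS energy and spectral flux from the raw signal:

```python
import numpy as np

def intensity_features(x, fs, frame=2048, hop=1024):
    """Two classic energy-related features often used as a starting
    point for 'perceived intensity' extraction (illustrative only):
    mean RMS energy and mean spectral flux."""
    frames = np.lib.stride_tricks.sliding_window_view(x, frame)[::hop]
    rms = np.sqrt((frames ** 2).mean(axis=1))
    spec = np.abs(np.fft.rfft(frames * np.hanning(frame), axis=1))
    flux = np.sqrt((np.diff(spec, axis=0) ** 2).sum(axis=1))
    return float(rms.mean()), float(flux.mean())

# A real extractor would feed such features (and many more) to a model
# trained against the perceptual test ratings.
```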
Perceptual evaluation of weighted multi-channel binaural format
This paper deals with the perceptual evaluation of an efficient method for creating 3D sound material on headphones. The two main issues of the classical two-channel binaural rendering technique are computational cost and individualization, both of which are emphasized in the context of a general-purpose 3D auditory display. Multi-channel binaural synthesis attempts to address them. Several studies have been dedicated to this approach, in which the minimum-phase parts of the Head-Related Transfer Functions (HRTFs) are linearly decomposed in order to achieve a separation of the direction and frequency variables. The present investigation aims at improving this model by applying weighting functions to the reconstruction error, in order to focus modeling effort on the most perceptually relevant cues in the frequency or spatial domain. To validate the methodology, a localization listening test is undertaken with static stimuli, using a reporting interface that minimizes interpretation errors. Beyond the optimization of the binaural implementation, one of the main questions addressed by the study is the search for a perceptually relevant definition of the reconstruction error.
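The linear decomposition with a weighted reconstruction error can be sketched as a weighted PCA of minimum-phase HRTF log-magnitudes (the specific weighting, matrix layout and names below are assumptions, not the paper's implementation):

```python
import numpy as np

def weighted_hrtf_basis(H, w, n_basis=8):
    """Linear decomposition of HRTF log-magnitudes with a frequency
    weighting on the reconstruction error. H: (n_directions, n_freqs)
    log-magnitude matrix; w: (n_freqs,) perceptual weights. Scaling the
    columns by sqrt(w) before the SVD makes the least-squares fit, and
    hence the reconstruction error, frequency-weighted. Sketch only."""
    sw = np.sqrt(np.asarray(w, float))
    Hw = (H - H.mean(axis=0)) * sw           # weight the error via sqrt(w)
    U, s, Vt = np.linalg.svd(Hw, full_matrices=False)
    basis = Vt[:n_basis] / sw                # unweighted spectral basis vectors
    coeffs = U[:, :n_basis] * s[:n_basis]    # direction-dependent weights
    return basis, coeffs, H.mean(axis=0)

# Reconstruction: H ~ mean + coeffs @ basis, with error minimised under w;
# the same idea applies to spatial-domain weights by scaling rows instead.
```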
Extraction of the excitation point location on a string using weighted least-square estimation of a comb filter delay
This paper focuses on the extraction of the excitation point location on a guitar string by iterative estimation of the structural parameters of the spectral envelope. We propose a general method to estimate the plucking point location, working in two stages: starting from a measure related to the autocorrelation of the signal as a first approximation, a weighted least-square estimation is then used to refine an FIR comb filter delay value to better fit the measured spectral envelope. This method is based on the fact that, in a simple digital physical model of a plucked-string instrument, the resonant modes translate into an all-pole structure while the initial conditions (a triangular shape for the string and zero velocity at all points) result in an FIR comb filter structure.
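The refinement stage can be sketched as follows: for a string plucked at relative position p, the ideal comb-filter envelope of harmonic k is proportional to |sin(pi*k*p)|. A weighted least-squares search around the first-stage estimate then picks the best-fitting p (a grid search stands in here for the paper's iterative scheme; names are illustrative):

```python
import numpy as np

def refine_plucking_point(harm_amps, p0, weights=None, span=0.05, steps=200):
    """Refine the relative plucking position p by fitting the comb-filter
    envelope |sin(pi*k*p)| to measured harmonic amplitudes in a weighted
    least-squares sense. harm_amps: amplitudes of harmonics 1..K;
    p0: initial guess (e.g. from the autocorrelation-based first stage)."""
    k = np.arange(1, len(harm_amps) + 1)
    a = np.asarray(harm_amps, float)
    w = np.ones_like(a) if weights is None else np.asarray(weights, float)
    best_p, best_err = p0, np.inf
    for p in np.linspace(max(p0 - span, 1e-3), min(p0 + span, 0.5), steps):
        env = np.abs(np.sin(np.pi * k * p))
        g = (w * a * env).sum() / ((w * env**2).sum() + 1e-12)  # optimal scale
        err = (w * (a - g * env) ** 2).sum()
        if err < best_err:
            best_p, best_err = p, err
    return best_p

# Example: synthetic envelope for a pluck at 23% of the string length
k = np.arange(1, 21)
amps = np.abs(np.sin(np.pi * k * 0.23))   # 1/k roll-off assumed already removed
print(refine_plucking_point(amps, p0=0.20))  # -> close to 0.23
```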
A non-linear technique for room impulse response estimation
Most techniques used to estimate the transfer function (or impulse response) of an acoustical space operate on similar principles. A known, broadband signal is transmitted at one point in the room whilst being simultaneously recorded at another. A matched filter is then used to compress the energy of the transmission waveform in time, forming an approximate impulse response. Finally, equalisation filtering is used to remove any colouration and phase distortion caused by the non-uniform energy spectrum of the transmission and/or the non-ideal response of the loudspeaker/microphone combination. In this paper, the limitations of this conventional technique are highlighted, especially when low-cost equipment is used. An alternative, non-linear deconvolution technique is proposed and shown to give superior performance with non-ideal equipment.
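The conventional pipeline the paper critiques, matched filtering followed by equalisation, amounts to regularised frequency-domain deconvolution, sketched below (signal names and the regularisation constant are illustrative; the paper's proposed non-linear technique is not shown):

```python
import numpy as np

def estimate_ir(recorded, transmitted, eps=1e-3, n_ir=None):
    """Conventional impulse-response estimate: matched filtering plus
    equalisation, implemented as regularised deconvolution
    H = Y * conj(X) / (|X|^2 + eps). eps guards against division by
    near-zero bins where the excitation has little energy."""
    n = len(recorded) + len(transmitted) - 1
    X = np.fft.rfft(transmitted, n)
    Y = np.fft.rfft(recorded, n)
    H = Y * np.conj(X) / (np.abs(X) ** 2 + eps)  # matched filter + EQ
    h = np.fft.irfft(H, n)
    return h[:n_ir] if n_ir else h

# Example: recover a toy 3-tap 'room' from white-noise excitation
rng = np.random.default_rng(1)
x = rng.standard_normal(48000)
h_true = np.array([1.0, 0.0, 0.5])
y = np.convolve(x, h_true)
print(np.round(estimate_ir(y, x, n_ir=5), 3))   # ~ [1., 0., 0.5, 0., 0.]
```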