Wave Field Synthesis - A Promising Spatial Rendering Concept
Modern convolution technologies offer possibilities to overcome principal shortcomings of loudspeaker stereophony by exploiting the Wave Field Synthesis (WFS) concept for rendering the spatial characteristics of virtual sound events. Based on the Huygens principle, loudspeaker arrays reproduce a synthetic sound field around the listener: the dry audio signal is combined with measured or modelled information about the room and the source’s position to enable accurate reproduction of the source within its acoustical environment. Not surprisingly, fundamental and practical constraints of WFS systems limit the rendering accuracy and the perceived spatial audio quality to a degree that depends on the characteristic features and technical parameters of the sound field synthesis. However, recent developments have already shown that a number of applications could become possible in the near future. An attractive example is the combination of WFS and stereophony, offering enhanced freedom in sound design as well as improved quality and more flexibility in practical playback situations for multichannel sound mixes.
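As a rough illustration of the delay-and-gain structure behind WFS rendering, the sketch below computes per-loudspeaker delays and amplitude weights for a virtual point source driving a linear array. All names are hypothetical, and the 1/sqrt(r) amplitude law is a stand-in for the full 2.5D driving function, which additionally involves filtering and stationary-phase corrections.

```python
import numpy as np

C = 343.0  # speed of sound in m/s

def wfs_delays_gains(source_xy, speaker_xys):
    """Per-loudspeaker delay (s) and amplitude weight for a virtual
    point source behind a linear array (simplified sketch only).
    Gains use a 1/sqrt(r) distance law in place of the full
    stationary-phase result of the 2.5D driving function."""
    src = np.asarray(source_xy, dtype=float)
    spk = np.asarray(speaker_xys, dtype=float)
    r = np.linalg.norm(spk - src, axis=1)       # source-to-speaker distances
    delays = r / C                               # propagation delays
    gains = 1.0 / np.sqrt(np.maximum(r, 1e-6))   # simplified amplitude decay
    return delays, gains

# Usage: 8 speakers spaced 0.2 m along the x-axis, source 1 m behind the array
speakers = [(0.2 * i, 0.0) for i in range(8)]
d, g = wfs_delays_gains((0.7, -1.0), speakers)
```

The dry source signal would then be delayed by `d[i]` and scaled by `g[i]` before being fed to loudspeaker `i`.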
Wave Field Synthesis - Generation and Reproduction of Natural Sound Environments
Since the early days of stereo, good spatial sound impression has been limited to a small region, the so-called sweet spot. About 15 years ago the concept of wave field synthesis (WFS), which solves this problem, was invented at TU Delft, but due to its computational complexity it has not been used outside universities and research institutes. Today the progress of microelectronics makes a variety of WFS applications possible, such as themed environments, cinemas, and exhibition spaces. This paper highlights the basics of WFS and discusses some of the solutions beyond the basics that make it work in applications.
Spatial Impulse Response Rendering
Spatial Impulse Response Rendering (SIRR) is a recent technique for reproduction of room acoustics with a multichannel loudspeaker system. SIRR analyzes the direction of arrival and diffuseness of measured room responses within frequency bands. Based on the analysis data, a multichannel response suitable for reproduction with any chosen surround loudspeaker setup is synthesized. When loaded into a convolving reverberator, the synthesized responses create a very natural perception of the space corresponding to the measured room. In this paper, the SIRR method is described and listening test results are reviewed. The sound-intensity-based analysis is refined, and improvements to the synthesis of diffuse time-frequency components are discussed.
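The direction-of-arrival and diffuseness analysis at the heart of SIRR can be sketched, under simplifying assumptions (horizontal-only B-format components in normalized units, a single frequency band), via the active sound intensity. The function name and normalization are illustrative, not the paper's exact formulation:

```python
import numpy as np

def sirr_analysis(W, X, Y):
    """Per-band DOA azimuth and diffuseness from B-format signals
    (horizontal components only), following the active-intensity idea
    behind SIRR. Inputs are complex time-frequency coefficients of one
    band; air density and c are folded into normalized units."""
    # Active intensity components: real part of pressure-velocity cross products
    Ix = np.real(np.conj(W) * X)
    Iy = np.real(np.conj(W) * Y)
    azimuth = np.arctan2(np.mean(Iy), np.mean(Ix))  # direction of net energy flow
    # Energy density ~ 0.5 * (|p|^2 + |v|^2) in normalized units
    E = 0.5 * (np.abs(W)**2 + np.abs(X)**2 + np.abs(Y)**2)
    I_mag = np.hypot(np.mean(Ix), np.mean(Iy))
    diffuseness = 1.0 - I_mag / np.maximum(np.mean(E), 1e-12)
    return azimuth, diffuseness
```

For a single plane wave the net intensity points at the source and diffuseness approaches 0; for a fully diffuse field the intensity averages out and diffuseness approaches 1.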
Binaural Source Localization
In binaural signals, interaural time differences (ITDs) and interaural level differences (ILDs) are two of the most important cues for the estimation of source azimuths, i.e., the localization of sources in the horizontal plane. For narrowband signals, according to the duplex theory, ITD is dominant at low frequencies and ILD is dominant at higher frequencies. Based on the STFT spectra of binaural signals, a method is proposed for the combined evaluation of ITD and ILD for each individual spectral coefficient. ITD and ILD are related to the azimuth through lookup models. Azimuth estimates based on ITD are more accurate, but ambiguous at higher frequencies due to phase wrapping. The less accurate but unambiguous azimuth estimates based on ILDs are used to select the closest candidate azimuth estimates based on ITDs, effectively improving the azimuth estimation. The method corresponds well with the duplex theory and also handles the transition from low to high frequencies gracefully. The relations between ITD, ILD, and azimuth are computed from a measured set of head-related transfer functions (HRTFs), yielding azimuth lookup models. Based on a study of these models for different subjects, parametric azimuth lookup models are proposed. The parameters of these models can be optimized for an individual subject whose HRTFs have been measured. In addition, subject-independent lookup models are proposed, parameterized only by the distance between the ears, effectively enabling source localization for subjects whose HRTFs have not been measured.
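A minimal sketch of the combined ITD/ILD evaluation for a single STFT coefficient: phase wrapping yields several ITD candidates, and the coarse ILD-based azimuth picks among them. The sine-law ITD lookup (spherical head, radius `A`) and the linear ILD-to-azimuth slope are crude placeholders; the paper instead derives both lookups from measured HRTFs. All names and constants are illustrative.

```python
import numpy as np

C = 343.0      # speed of sound (m/s)
A = 0.0875     # assumed head radius (m)

def itd_candidates(phase_diff, freq):
    """All ITDs consistent with an interaural phase difference at freq,
    allowing for wraps of 2*pi; only head-plausible delays are kept."""
    itd_max = 4.0 * A / C                     # loose bound on physical ITD
    ks = np.arange(-5, 6)
    cands = (phase_diff + 2 * np.pi * ks) / (2 * np.pi * freq)
    return cands[np.abs(cands) <= itd_max]

def azimuth_from_itd(itd):
    """Toy sine lookup model: itd = (2a/c) * sin(azimuth)."""
    return np.arcsin(np.clip(itd * C / (2 * A), -1.0, 1.0))

def localize(L, R, freq, ild_slope=0.1):
    """Combine ITD and ILD for one STFT coefficient: the unambiguous
    ILD-based azimuth disambiguates the wrapped ITD candidates.
    ild_slope (radians per dB) is a crude placeholder lookup."""
    ild = 20.0 * np.log10(np.abs(L) / np.abs(R))
    az_ild = np.clip(ild * ild_slope, -np.pi / 2, np.pi / 2)
    phase_diff = np.angle(L) - np.angle(R)
    az_itd = np.array([azimuth_from_itd(t)
                       for t in itd_candidates(phase_diff, freq)])
    return az_itd[np.argmin(np.abs(az_itd - az_ild))]
```

At 3 kHz the interaural phase exceeds one cycle for lateral sources, so the ITD alone is ambiguous; the ILD estimate selects the correct candidate.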
Parametric Coding of Spatial Audio
Recently, there has been a renewed interest in techniques for coding stereo and multi-channel audio signals. Stereo and multi-channel audio signals evoke an auditory spatial image in a listener. Thus, in addition to pure redundancy reduction, a receiver model which considers properties of spatial hearing may be used to reduce the bitrate. Previous techniques have done this by considering the importance of interaural level difference cues at high frequencies and by considering the binaural masking level difference when computing the masked threshold for multiple audio channels. More recently, a number of more systematic, parameterized techniques of this kind were introduced. In this paper an overview of one such technique, denoted binaural cue coding (BCC), is given. BCC represents stereo or multi-channel audio signals as one or more downmixed audio channels plus side information. The side information contains the inter-channel cues inherent in the original audio signal that are relevant for the perception of the properties of the auditory spatial image. The relation between the inter-channel cues and attributes of the auditory spatial image is discussed. Other applications of BCC are also discussed, such as the joint coding of independent audio signals, providing flexibility at the decoder to mix arbitrary stereo, multi-channel, and binaural signals.
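The downmix-plus-cues idea behind BCC can be sketched per FFT bin; a real coder groups bins into perceptual bands, adds a time-difference cue, and quantizes the side information. The function `bcc_encode` and its cue definitions are illustrative, not the standardized scheme:

```python
import numpy as np

def bcc_encode(L, R, n_fft=512):
    """Toy BCC-style encoder: a sum downmix plus per-bin inter-channel
    level difference (ICLD, dB) and phase difference (ICPD, rad) as
    side information. Real BCC estimates cues per critical band, not
    per bin, and transmits them at a low rate."""
    SL = np.fft.rfft(L, n_fft)
    SR = np.fft.rfft(R, n_fft)
    downmix = 0.5 * (L + R)                      # single transmitted channel
    eps = 1e-12                                  # guard against empty bins
    icld = 10.0 * np.log10((np.abs(SL)**2 + eps) / (np.abs(SR)**2 + eps))
    icpd = np.angle(SL * np.conj(SR))
    return downmix, icld, icpd
```

A matching decoder would re-pan the downmix spectrum bin by bin (or band by band) so that the transmitted cues are restored in the output.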
From Joint Stereo to Spatial Audio Coding - Recent Progress and Standardization
Within the evolution of perceptual audio coding, there is a long history of exploiting techniques for the joint coding of several simultaneously presented audio channels of an audio program. The paper describes how such techniques have progressed over time into the recent concept of spatial audio coding, which is currently under standardization within the ISO/MPEG group. As a significant improvement over conventional techniques, this approach allows the representation of high-quality multi-channel audio at bitrates of only 64 kbit/s and below.
Low Complexity Parametric Stereo Coding in MPEG-4
Parametric stereo coding, in combination with a state-of-the-art coder for the underlying monaural audio signal, results in the most efficient coding scheme for stereo signals at very low bit rates available today. This paper reviews those aspects of the parametric stereo paradigm that are important for audio coding applications. A complete parametric stereo coding system is presented, which was recently standardized in MPEG-4 Audio. Using complex modulated filter banks, it allows implementation with low computational complexity. The system is backward compatible and enables high-quality stereo coding at a total bit rate of 24 kbit/s when used in combination with aacPlus.
Computational Real-Time Sound Simulation of Rain
Real-time sound synthesis in computer games using physical modeling is an area of great potential. To date, most sounds are pre-recorded to match a certain event. By instead using a model that describes the sound-producing event, a number of problems encountered with pre-recorded sounds can be avoided. This paper deals with the application of physical modeling to the sound synthesis of rainfall. The implementation of a real-time simulation and a graphical interface allowing interactive control of the rainfall sound are discussed.
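A toy statistical model in the spirit of such rain synthesis can be sketched as follows: drop onsets arrive as a Poisson process, and each drop is rendered as a short exponentially damped sinusoid whose frequency and amplitude vary with a random "drop size". The parameter values are plausible placeholders rather than physically derived quantities, and the function name is hypothetical:

```python
import numpy as np

def rain(duration=1.0, fs=44100, rate=400.0, seed=0):
    """Toy rain renderer: Poisson drop onsets, each a 10 ms damped
    sine burst. A physical model would derive frequency, damping, and
    amplitude from impact and bubble acoustics; here they are simple
    placeholders controlled by a random drop size."""
    rng = np.random.default_rng(seed)
    n = int(duration * fs)
    out = np.zeros(n)
    t_drop = np.arange(int(0.01 * fs)) / fs           # 10 ms per drop
    n_drops = rng.poisson(rate * duration)            # expected rate drops/s
    for _ in range(n_drops):
        size = rng.uniform(0.3, 1.0)                  # bigger drop: louder, lower
        f0 = 4000.0 / size                            # placeholder resonance (Hz)
        burst = size * np.exp(-t_drop * 800.0) * np.sin(2 * np.pi * f0 * t_drop)
        start = rng.integers(0, n - len(t_drop))
        out[start:start + len(t_drop)] += burst
    return out / max(np.max(np.abs(out)), 1e-9)       # normalize to [-1, 1]
```

Exposing `rate` and the drop-size range as live controls is exactly the kind of interactivity that pre-recorded samples cannot offer.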
A Strategy for the Modular Implementation of Physics-Based Models
For reasons of practical handling as well as optimization of the development and implementation process, it is desirable to realize real-time models of sound-emitting physical processes in a modular fashion that reflects an intuitively understandable structure of the underlying scenario. At the same time, in discrete-time algorithms based on physical descriptions, the occurrence of non-computable instantaneous feedback loops has to be avoided. The latter obstacle prohibits the naive cross-connection of input-output signal processing blocks. This paper presents an approach to gaining modularity in the implementation of physics-based models while preventing non-computable loops, applicable to a wide class of systems. The strategy has been realized practically in the development of real-time sound models in the course of the Sounding Object [1] European research project.
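One standard way to break an instantaneous feedback loop (not necessarily the paper's strategy, which handles a wider class of systems) is to place a unit delay on one arm of the cross-connection, so each sample can be computed in a fixed order. The block classes below are hypothetical illustrations:

```python
class Module:
    """A signal-processing block with a per-sample process(x) method."""
    def process(self, x):
        raise NotImplementedError

class OnePole(Module):
    """Leaky integrator y[n] = a*y[n-1] + x[n], a toy 'physical' block."""
    def __init__(self, a=0.5):
        self.a, self.y = a, 0.0
    def process(self, x):
        self.y = self.a * self.y + x
        return self.y

class Gain(Module):
    """Memoryless scaling block."""
    def __init__(self, g):
        self.g = g
    def process(self, x):
        return self.g * x

class FeedbackPair:
    """Cross-connects modules a and b. Feeding b the *previous* output
    of a (a unit delay in the loop) removes the instantaneous
    dependency, so the naive cross-connection becomes computable."""
    def __init__(self, a, b):
        self.a, self.b = a, b
        self.z = 0.0                       # unit-delay state on the a -> b path
    def tick(self, x):
        fb = self.b.process(self.z)        # b sees a's output one sample late
        ya = self.a.process(x + fb)
        self.z = ya
        return ya
```

Without the `self.z` delay, `a` would need `b`'s current output and vice versa within the same sample, which is exactly the non-computable loop the paper sets out to avoid.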
Implementing Loudness Models in MATLAB
In the field of psychoacoustic analysis, the goal is to construct a transformation that maps a time waveform into a domain that best captures the response of a human perceiving sound. A key element of such transformations is the mapping between sound intensity in decibels and its actual perceived loudness. A number of different loudness models exist to achieve this mapping. This paper examines implementation strategies for some of the better-known models in the MATLAB software environment.
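The common core of these mappings can be sketched in two steps: sound pressure to level in dB SPL, and level (in phons) to perceived loudness in sones via Stevens' doubling rule (40 phons is 1 sone, and every additional 10 phons doubles loudness). The sketch is in Python rather than MATLAB, purely for illustration; full models such as Zwicker's or Moore and Glasberg's first compute specific loudness per critical band before summing to a total.

```python
import math

P0 = 20e-6  # reference pressure, 20 micropascals

def spl_db(p_rms):
    """Sound pressure level in dB re 20 uPa."""
    return 20.0 * math.log10(p_rms / P0)

def sones_from_phons(phons):
    """Stevens' power-law mapping: 40 phons = 1 sone,
    and each +10 phons doubles the perceived loudness."""
    return 2.0 ** ((phons - 40.0) / 10.0)
```

For a 1 kHz tone the phon value equals the dB SPL value by definition, so these two functions already chain into a minimal loudness estimate for that special case.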