Digital Sound Synthesis of Brass Instruments by Physical Modeling
The Functional Transformation Method (FTM) is an established method for sound synthesis by physical modeling, which has so far proven its feasibility in applications to strings and membranes. Based on integral transformations, it provides a discrete solution for continuous physical problems given in the form of initial-boundary-value problems. This paper extends the range of applications of the FTM to brass instruments. A full continuous physical model of the instrument, consisting of an air column, a mouthpiece, and the player’s lips, is introduced and solved in the discrete domain. It is shown that the FTM is also a suitable method for sound synthesis of brass instruments.
Recent Advances in Physical Modeling with K- and W-Techniques
Physical (or physics-based) modeling of musical instruments is one of the main research fields in computer music. A basic question that has recently attracted increasing research interest is how different discrete-time modeling paradigms are interrelated and can be combined, so that modeling with wave quantities (W-methods) and with Kirchhoff quantities (K-methods) can be understood within the same theoretical framework. This paper presents recent results from the HUT Sound Source Modeling group, both in the form of theoretical discussions and through examples of K- vs. W-modeling in sound synthesis of musical instruments.
Computation of Nonlinear Filter Networks Containing Delay-Free Paths
A method for solving filter networks made of linear and nonlinear filters is presented. The method is valid independently of the presence of delay-free paths in the network, provided that the nonlinearities in the system respect certain (weak) hypotheses verified by a wide class of real components: in particular, that the contribution to the output due to the memory of the nonlinear blocks can be extracted from each nonlinearity separately. The method translates into a general procedure for computing the filter network, hence it can serve as a testbed for offline testing of complex audio systems and as a starting point toward further code optimizations aimed at achieving real time.
Modal-Type Synthesis Techniques for Nonlinear Strings with an Energy Conservation Property
There has recently been increased interest in the modelling of string vibration under large amplitude conditions, for sound synthesis purposes. A simple nonlinear model is given by the Kirchhoff-Carrier equation, which can be thought of as a generalization of the wave equation to the case in which the string tension is “modulated” by variations in the length of the string under deformation. Finite difference schemes are one means of approach for the simulation of nonlinear PDE systems; in this case, however, as the nonlinearity is spatially invariant, the solution may be broken down into sinusoidal components, much as in the linear case. More importantly, if time discretization is carried out in a particular way, it is possible to obtain a conserved energy in the numerical scheme, leading to a useful numerical stability guarantee, which can be difficult to obtain for strongly nonlinear systems. Numerical results are presented.
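For orientation, the Kirchhoff-Carrier equation is commonly written as below. The symbols are assumptions not given in the abstract: $u(x,t)$ is the transverse displacement, $\rho$ the density, $A$ the cross-sectional area, $T_0$ the rest tension, $E$ Young's modulus, and $L$ the string length, with both ends assumed fixed. Because the nonlinear term is a single number for the whole string at each instant, a sine-series expansion gives one ordinary differential equation per mode, coupled only through that shared tension term. This is a sketch of the standard form, not necessarily the exact formulation used in the paper.

```latex
% Kirchhoff-Carrier equation (standard form; notation assumed)
\rho A \, \frac{\partial^2 u}{\partial t^2}
  = \left( T_0 + \frac{E A}{2 L} \int_0^L
      \left( \frac{\partial u}{\partial x} \right)^{2} dx \right)
    \frac{\partial^2 u}{\partial x^2}

% Modal expansion for fixed ends: u(x,t) = \sum_m q_m(t) \sin(m \pi x / L)
\rho A \, \ddot{q}_m
  = - \left( \frac{m \pi}{L} \right)^{2}
    \left( T_0 + \frac{E A}{4} \sum_{n}
      \left( \frac{n \pi}{L} \right)^{2} q_n^{2} \right) q_m
```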
Wave Field Synthesis - A Promising Spatial Rendering Concept
Modern convolution technologies offer possibilities to overcome principal shortcomings of loudspeaker stereophony by exploiting the Wave Field Synthesis (WFS) concept for rendering virtual spatial characteristics of sound events. Based on the Huygens principle, loudspeaker arrays reproduce a synthetic sound field around the listener, whereby the dry audio signal is combined with measured or modelled information about the room and the source’s position to enable the accurate reproduction of the source within its acoustical environment. Not surprisingly, basic and practical constraints of WFS systems limit the rendering accuracy and the perceived spatial audio quality to a certain degree, depending on characteristic features and technical parameters of the sound field synthesis. However, recent developments have already shown that a number of applications could be possible in the near future. An attractive example is the combination of WFS and stereophony, offering enhanced freedom in sound design as well as improved quality and more flexibility in practical playback situations for multichannel sound mixes.
Wave Field Synthesis - Generation and Reproduction of Natural Sound Environments
Since the early days of stereo, good spatial sound impression has been limited to a small region, the so-called sweet spot. About 15 years ago, the concept of wave field synthesis (WFS), which solves this problem, was invented at TU Delft, but due to its computational complexity it has not been used outside universities and research institutes. Today, progress in microelectronics makes a variety of applications of WFS possible, such as themed environments, cinemas, and exhibition spaces. This paper highlights the basics of WFS and discusses some of the solutions beyond the basics that are needed to make it work in applications.
Spatial Impulse Response Rendering
Spatial Impulse Response Rendering (SIRR) is a recent technique for reproduction of room acoustics with a multichannel loudspeaker system. SIRR analyzes the direction of arrival and diffuseness of measured room responses within frequency bands. Based on the analysis data, a multichannel response suitable for reproduction with any chosen surround loudspeaker setup is synthesized. When loaded into a convolving reverberator, the synthesized responses create a very natural perception of space corresponding to the measured room. In this paper, the SIRR method is described and listening test results are reviewed. The sound-intensity-based analysis is refined, and improvements for the synthesis of diffuse time-frequency components are discussed.
Binaural source localization
In binaural signals, interaural time differences (ITDs) and interaural level differences (ILDs) are two of the most important cues for the estimation of source azimuths, i.e., the localization of sources in the horizontal plane. For narrowband signals, according to the duplex theory, ITD is dominant at low frequencies and ILD is dominant at higher frequencies. Based on the STFT spectra of binaural signals, a method is proposed for the combined evaluation of ITD and ILD for each individual spectral coefficient. ITD and ILD are related to the azimuth through lookup models. Azimuth estimates based on ITD are more accurate but ambiguous at higher frequencies due to phase wrapping. The less accurate but unambiguous azimuth estimates based on ILDs are used to select the closest candidate azimuth estimates based on ITDs, effectively improving the azimuth estimation. The method corresponds well with the duplex theory and also handles the transition from low to high frequencies gracefully. The relations between the ITD and ILD and the azimuth are computed from a measured set of head-related transfer functions (HRTFs), yielding azimuth lookup models. Based on a study of these models for different subjects, parametric azimuth lookup models are proposed. The parameters of these models can be optimized for an individual subject whose HRTFs have been measured. In addition, subject-independent lookup models are proposed, parametrized only by the distance between the ears, effectively enabling source localization for subjects whose HRTFs have not been measured.
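The candidate-selection step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the default ear distance of 0.18 m, and the speed of sound are hypothetical, and a simple sine-law model (ITD = (d/c)·sin(azimuth)) stands in for the HRTF-derived lookup models. The interaural phase difference of one STFT bin yields several possible ITDs because of phase wrapping; the coarse ILD-based azimuth is used only to pick among them.

```python
import numpy as np

def itd_azimuth_candidates(ipd, freq_hz, ear_distance_m=0.18, c=343.0):
    """Return all azimuths (radians) consistent with a wrapped interaural
    phase difference `ipd` (radians) at frequency `freq_hz`.

    Assumes a sine-law model ITD = (d / c) * sin(azimuth); this is a
    stand-in for the lookup models described in the abstract.
    """
    max_itd = ear_distance_m / c
    # Number of phase wraps that can fit inside the physical ITD range.
    k_max = int(np.ceil(max_itd * freq_hz)) + 1
    ks = np.arange(-k_max, k_max + 1)
    # Each integer k unwraps the phase by one full cycle.
    itds = (ipd + 2.0 * np.pi * ks) / (2.0 * np.pi * freq_hz)
    # Keep only ITDs that a source in the horizontal plane could produce.
    itds = itds[np.abs(itds) <= max_itd]
    return np.arcsin(itds / max_itd)

def pick_azimuth(ipd, freq_hz, az_ild, **model_kwargs):
    """Choose the ITD-based candidate closest to the ILD-based estimate."""
    candidates = itd_azimuth_candidates(ipd, freq_hz, **model_kwargs)
    if candidates.size == 0:
        # No physically plausible ITD candidate: fall back to the ILD estimate.
        return az_ild
    return candidates[np.argmin(np.abs(candidates - az_ild))]
```

At low frequencies the candidate list typically contains a single entry, so the ILD estimate has no influence; at higher frequencies several candidates appear and the ILD estimate resolves the ambiguity, which mirrors the duplex-theory behaviour mentioned in the abstract.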
Parametric Coding of Spatial Audio
Recently, there has been renewed interest in techniques for coding stereo and multichannel audio signals. Stereo and multichannel audio signals evoke an auditory spatial image in a listener. Thus, in addition to pure redundancy reduction, a receiver model that considers properties of spatial hearing may be used for reducing the bitrate. This has been done in previous techniques by considering the importance of interaural level difference cues at high frequencies and by considering the binaural masking level difference when computing the masked threshold for multiple audio channels. Recently, a number of more systematic, parameterized techniques of this kind were introduced. In this paper, an overview of one such technique, denoted binaural cue coding (BCC), is given. BCC represents stereo or multichannel audio signals as one or more downmixed audio channels plus side information. The side information contains the inter-channel cues inherent in the original audio signal that are relevant for the perception of the properties of the auditory spatial image. The relation between the inter-channel cues and attributes of the auditory spatial image is discussed. Other applications of BCC are discussed, such as joint coding of independent audio signals, providing flexibility at the decoder to mix arbitrary stereo, multichannel, and binaural signals.
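As a rough illustration of the "downmix plus side information" idea, the sketch below computes one kind of inter-channel cue, a per-band level difference in dB, for a single stereo spectral frame, together with a mono downmix. The function names, the band partition, and the plain averaging downmix are assumptions made for this example; the scheme described in the paper may use different cues, band layouts, and downmix rules.

```python
import numpy as np

def inter_channel_level_difference_db(left_spec, right_spec, band_edges, eps=1e-12):
    """Per-band level difference (right relative to left) in dB.

    `left_spec` and `right_spec` are complex spectra of one frame;
    `band_edges` lists bin indices delimiting the bands (hypothetical layout).
    """
    cues = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        power_left = np.sum(np.abs(left_spec[lo:hi]) ** 2)
        power_right = np.sum(np.abs(right_spec[lo:hi]) ** 2)
        # eps avoids division by zero and log of zero in silent bands.
        cues.append(10.0 * np.log10((power_right + eps) / (power_left + eps)))
    return np.array(cues)

def mono_downmix(left_spec, right_spec):
    """Simple averaging downmix (an assumption, not the paper's rule)."""
    return 0.5 * (left_spec + right_spec)
```

A decoder in this style would receive the downmix and the per-band cues and redistribute the downmix energy between the output channels accordingly.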
From Joint Stereo to Spatial Audio Coding - Recent Progress and Standardization
Within the evolution of perceptual audio coding, there is a long history of exploiting techniques for joint coding of several audio channels of an audio program that are presented simultaneously. The paper describes how such techniques have progressed over time into the recent concept of spatial audio coding, which is currently under standardization within the ISO/MPEG group. As a significant improvement over conventional techniques, this approach allows the representation of high-quality multichannel audio at bitrates of only 64 kbit/s and below.