Download Block Processing Strategies for Computationally Efficient Dynamic Range Controllers This paper presents several strategies for designing Dynamic Range Controllers that use a block-based processing scheme instead of a sample-by-sample processing scheme. Energy measurement, gain calculation, and time constant selection are executed only once per incoming block of samples. Then, a simple and continuous gain update is computed and applied sample-by-sample between consecutive sample blocks to achieve good sound quality and performance. This approach reduces the computational cost while maintaining the flexibility and behavior of sample-by-sample processing solutions. Several implementation optimizations are also presented for reducing the computational cost and achieving a flexible and better sounding dynamic curve using configurable soft knees or gain tables. The proposed approach has been implemented and tested on a modern DSP, achieving satisfactory results with considerable savings in computational cost.
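The block-based idea described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `block_compressor`, the hard-knee curve, the RMS detector, and all parameter defaults are assumptions chosen for illustration, and attack/release time constants are left out. Level measurement and gain calculation run once per block, while a linear gain ramp is applied sample by sample so the gain stays continuous across block boundaries.

```python
import math

def block_compressor(x, block_size=64, threshold_db=-20.0, ratio=4.0):
    """Hypothetical hard-knee compressor: gain computed per block, ramped per sample."""
    out = []
    prev_gain = 1.0  # linear gain reached at the end of the previous block
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        # Once per block: RMS level measurement in dB.
        rms = math.sqrt(sum(s * s for s in block) / len(block))
        level_db = 20.0 * math.log10(max(rms, 1e-12))
        # Once per block: static gain curve (hard knee, assumed for simplicity).
        over_db = max(level_db - threshold_db, 0.0)
        gain_db = -over_db * (1.0 - 1.0 / ratio)
        target_gain = 10.0 ** (gain_db / 20.0)
        # Per sample: cheap linear ramp from the previous gain to the new target.
        step = (target_gain - prev_gain) / len(block)
        for n, s in enumerate(block):
            out.append(s * (prev_gain + step * (n + 1)))
        prev_gain = target_gain
    return out
```

The only per-sample work is one multiply and one add for the ramp; the logarithm, square root, and power are paid once per block.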
Download A Physically-motivated Triode Model for Circuit Simulations A new model for triodes of type 12AX7 is presented, featuring simple and continuously differentiable equations. The description is physically motivated and enables a good replication of the grid current. Free parameters in the equations are fitted to reference data obtained from measurements of practical triodes. It is shown that the equations characterize the properties of real tubes with good accuracy. Results for the model itself and for the model embedded in an amplifier simulation are presented and align well.
Download A Simple and Efficient Fader Estimator for Broadcast Radio Unmixing This paper presents a framework for estimating the fader gains of a mixing console in the context of broadcast radio production. The console state can generally be retrieved only through a human-machine interface, which does not permit automatic processing of such information. A simple algorithm is provided to estimate the fader positions from the different inputs and the output signal of the console. This method also allows the extraction of an additional unknown input present in the mix output. An exhaustive study of the optimal parameter settings is then detailed, showing good estimation results.
Download PVSOLA: A Phase Vocoder with Synchronized OverLap-Add In this paper we present an original method mixing temporal and spectral processing to reduce the phasiness in the phase vocoder. Phasiness is an inherent artifact of the phase vocoder that appears when a sound is slowed down. The audio is perceived as muffled, reverberant and/or moving away from the microphone. This is due to the loss of coherence between the phases across the bins of the Short-Term Fourier Transform over time. Here the phase vocoder is used almost as usual, except that its phases are regularly reset in order to keep them coherent. Phase reset consists in using a frame from the input signal for synthesis without modifying it. The position of that frame in the output audio is adjusted using cross-correlation, as is done in many temporal time-stretching methods. The method is compared with three state-of-the-art algorithms. The results show a significant improvement over existing processes although some test samples present artifacts.
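The cross-correlation alignment mentioned in this abstract, as used in many temporal time-stretching methods, can be sketched as a search for the lag with the highest normalized cross-correlation. This is a generic illustration, not the PVSOLA code: the name `best_alignment` and its parameters are hypothetical.

```python
import math

def best_alignment(output_tail, frame, max_lag):
    """Return the lag in [0, max_lag] where `frame` best matches `output_tail`."""
    frame_norm = math.sqrt(sum(v * v for v in frame))
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        segment = output_tail[lag:lag + len(frame)]
        if len(segment) < len(frame):
            break  # not enough samples left to compare a full frame
        seg_norm = math.sqrt(sum(v * v for v in segment))
        if seg_norm == 0.0 or frame_norm == 0.0:
            score = 0.0  # silent segment: correlation is undefined, treat as neutral
        else:
            dot = sum(a * b for a, b in zip(segment, frame))
            score = dot / (seg_norm * frame_norm)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Normalizing by the segment energy keeps loud but dissimilar regions from winning the search.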
Download Vivos Voco: A survey of recent research on voice transformations at IRCAM IRCAM has long experience in the analysis, synthesis and transformation of voice. Natural voice transformations are of great interest for many applications and can be combined with a text-to-speech system, leading to a powerful creation tool. We present research conducted at IRCAM on voice transformations over the last few years. Transformations can be achieved in a global way by modifying pitch, spectral envelope, durations etc. While it sacrifices the possibility to attain a specific target voice, the approach allows the production of new voices of a high degree of naturalness with different gender and age, modified vocal quality, or another speech style. These transformations can be applied in realtime using ircamTools TRAX. Transformation can also be done in a more specific way in order to transform a voice towards the voice of a target speaker. Finally, we present some recent research on the transformation of expressivity.
Download Efficient Polynomial Implementation of the EMS VCS3 Filter Model A previously existing nonlinear differential equation system modeling the EMS VCS3 voltage controlled filter is reformulated here in polynomial form, avoiding the expensive computation of transcendental functions imposed by the original model. The new system is discretized by means of an implicit numerical scheme, and solved using Newton-Raphson iterations. While maintaining instantaneous controllability, the algorithm is both significantly faster and more accurate than the previous filter-based solution. A real-time version of the model has been implemented under the PureData audio processing environment and as a VST plugin.
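The combination of an implicit scheme with Newton-Raphson iterations can be shown on a toy problem. The sketch below is not the VCS3 filter model: it assumes a hypothetical one-dimensional system dy/dt = u - y^3 with a polynomial nonlinearity, discretized with backward Euler, and the names `implicit_step` and `simulate` are illustrative.

```python
def implicit_step(y_prev, u, h, tol=1e-12, max_iter=20):
    """One backward-Euler step of dy/dt = u - y**3, solved by Newton-Raphson."""
    y = y_prev  # previous output is a good initial guess
    for _ in range(max_iter):
        # Residual of the implicit equation y = y_prev + h * (u - y**3).
        residual = y - y_prev - h * (u - y ** 3)
        slope = 1.0 + 3.0 * h * y * y  # derivative of the residual, always >= 1
        delta = residual / slope
        y -= delta
        if abs(delta) < tol:
            break
    return y

def simulate(u_values, h=0.1, y0=0.0):
    """Run the toy system over a sequence of input samples."""
    y, out = y0, []
    for u in u_values:
        y = implicit_step(y, u, h)
        out.append(y)
    return out
```

Because the nonlinearity is polynomial, each iteration needs only multiplications and one division, with no transcendental function calls.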
Download Interaction-optimized Sound Database Representation Interactive navigation within geometric, feature-based database representations allows expressive musical performances and installations. Once mapped to the feature space, the user’s position in a physical interaction setup (e.g. a multitouch tablet) can be used to select elements or trigger audio events. Hence physical displacements are directly connected to the evolution of sonic characteristics, a property we call analytic sound–control correspondence. However, automatically computed representations have a complex geometry which is unlikely to fit the interaction setup optimally. After a review of related work, we present a physical model-based algorithm that redistributes the representation within a user-defined region according to a user-defined density. The algorithm is designed to preserve the analytic sound–control correspondence property as much as possible, and uses a physical analogy between the triangulated database representation and a truss structure. After preliminary pre-uniformisation steps, internal repulsive forces help to spread points across the whole region until a target density is reached. We measure the algorithm performance relative to its ability to produce representations corresponding to user-specified features and to preserve analytic sound–control correspondence during a standard density-uniformisation task. Quantitative measures and visual evaluation show the excellent performance of the algorithm, as well as the benefit of the pre-uniformisation steps.
Download Realtime system for backing vocal harmonization A system for the synthesis of backing vocals by pitch shifting of a lead vocal signal is presented. The harmonization of the backing vocals is based on the chords which are retrieved from an accompanying instrument. The system operates completely autonomously, without the need to provide the key of the performed song. This simplifies the handling of the harmonization effect. The system is designed to have realtime capability so that it can be used as a live sound effect.
Download Phantom Source Widening With Deterministic Frequency Dependent Time Delays We present a novel method to adjust the perceived width of a phantom source by varying the deterministic inter-channel time difference (ICTD) in a pair of signals over frequency. In contrast to existing literature that focuses on random phase over frequency, our paper considers a deterministic approach that is open to a more systematic evaluation. Two allpass structures are described, finite impulse response (FIR) and infinite impulse response (IIR), for phase-based phantom source widening and evaluated in a formal listening test. Varying ICTD over frequency essentially alters the inter-aural cross correlation coefficient at the ears of a listener and in this way provides a robust way to control the auditory source width. The subjective evaluation results fully support our observations for both noise and speech signals.
Download Implementing Real-Time Partitioned Convolution Algorithms on Conventional Operating Systems We describe techniques for implementing real-time partitioned convolution algorithms on conventional operating systems using two different scheduling paradigms: time-distributed (cooperative) and multi-threaded (preemptive). We discuss the optimizations applied to both implementations and present measurements of their performance for a range of impulse response lengths on a recent high-end desktop machine. We find that while the time-distributed implementation is better suited for use as a plugin within a host audio application, the preemptive version was easier to implement and significantly outperforms the time-distributed version despite the overhead of frequent context switches.
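The principle behind partitioned convolution can be sketched in a few lines: the impulse response is split into partitions, each partition is convolved with the input separately, and the partial results are summed at the matching delay. This sketch is an assumption-laden simplification, not the paper's implementation: it works in the time domain and offline, whereas real-time implementations typically use FFT-based convolution per partition and a scheduler. The names `direct_convolve` and `partitioned_convolve` are hypothetical.

```python
def direct_convolve(x, h):
    """Plain time-domain linear convolution, used as reference and building block."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def partitioned_convolve(x, h, part_len):
    """Convolve x with h by processing h in partitions of at most part_len taps."""
    y = [0.0] * (len(x) + len(h) - 1)
    for offset in range(0, len(h), part_len):
        part = h[offset:offset + part_len]
        partial = direct_convolve(x, part)
        # Each partition contributes at a delay equal to its offset within h.
        for n, v in enumerate(partial):
            y[offset + n] += v
    return y
```

Since convolution is linear, the partitioned result matches the direct one; the practical benefit is that early partitions can be short for low latency while later ones are processed with more slack.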