Time-Scaling of Audio Signals with Multi-Scale Gabor Analysis
The phase vocoder is a standard frequency-domain time-scaling technique suitable for polyphonic audio, but it generates annoying artifacts called phasiness (or loss of presence) and transient smearing, especially for high values of the time-scale parameter. In this paper, a new time-scaling algorithm for polyphonic audio signals is described. It uses a multi-scale Gabor analysis for low-frequency content and a vocoder with phase-locking on transients for the residual signal and for high-frequency content. Compared to a phase-locking vocoder alone, our method significantly reduces both phasiness and transient smearing, especially for high values of the time-scale parameter. For time-contraction (time-scale parameters lower than one), the results seem to be more signal-dependent.
Real-Time Pitch-Shifting of Musical Signals by a Time-Varying Factor Using Normalized Filtered Correlation Time-Scale Modification
This paper presents a high-quality real-time pitch-shifting algorithm with a time-varying factor for monophonic audio and musical signals. The pitch-shifting algorithm is based on the resampling and time-scale modification method. A new time-scale modification method has been developed, called the Normalized Filtered Correlation Time-Scale Modification (NFC-TSM) method. It uses a ring buffer for time-scaling. The best splicing point is searched for in the normalized low-pass filtered signal using the Average Magnitude Difference Function (AMDF). The new method results in low-latency, high-quality pitch-shifting of musical signals.
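As a rough illustration of the AMDF-based search mentioned above, the following Python sketch picks the lag at which a frame best matches a later copy of itself. It is not the NFC-TSM method: the normalization, low-pass filtering, and ring buffer are omitted, and the function names, frame length, and lag range are assumptions made for the example.

```python
import math

def amdf(signal, start, frame_len, lag):
    """Average magnitude difference between a frame and the frame `lag` samples later."""
    total = 0.0
    for n in range(frame_len):
        total += abs(signal[start + n] - signal[start + n + lag])
    return total / frame_len

def best_splice_lag(signal, start, frame_len, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the smallest AMDF value."""
    return min(range(min_lag, max_lag + 1),
               key=lambda lag: amdf(signal, start, frame_len, lag))

# Usage: a 100 Hz sine sampled at 8 kHz repeats every 80 samples,
# so the best lag in the searched range is one period.
fs = 8000
x = [math.sin(2 * math.pi * 100 * n / fs) for n in range(400)]
lag = best_splice_lag(x, start=0, frame_len=160, min_lag=40, max_lag=120)
```

Splicing at a lag with a low AMDF value keeps the waveform nearly continuous across the join, which is why a similarity search of this kind is used to choose splice points.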
Real-Time Reverb Simulation Using Arbitrary Models
We present a method for simulating reverberation in real time using arbitrary object shapes. This method is an extension of digital plate reverberation where a dry signal is filtered through a physical model of an object vibrating in response to audio input. Using the modal synthesis method, we can simulate the vibration of many different shapes and materials in real time. Sound samples are available at the following website:
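A minimal sketch of the modal idea, assuming each vibration mode can be approximated by a two-pole resonator and that the wet signal is the sum of all resonator outputs. The mode list here (frequency, decay time, gain) is invented for the example; in a setup like the one described, such values would come from a physical model of the object, which is not shown.

```python
import math

def modal_reverb(dry, modes, fs):
    """Filter `dry` through parallel two-pole resonators, one per mode.

    Each mode is a (frequency_hz, decay_time_s, gain) tuple (hypothetical values).
    """
    wet = [0.0] * len(dry)
    for freq, t60, gain in modes:
        r = 10 ** (-3.0 / (t60 * fs))            # pole radius: 60 dB decay in t60 seconds
        a1 = 2 * r * math.cos(2 * math.pi * freq / fs)
        a2 = -r * r
        y1 = y2 = 0.0                            # previous two outputs of this resonator
        for n, x in enumerate(dry):
            y = gain * x + a1 * y1 + a2 * y2
            wet[n] += y
            y2, y1 = y1, y
    return wet

# Usage: response of three made-up modes to a unit impulse (1 second at 8 kHz).
fs = 8000
impulse = [1.0] + [0.0] * (fs - 1)
modes = [(220.0, 0.5, 0.5), (470.0, 0.3, 0.3), (910.0, 0.2, 0.2)]
wet = modal_reverb(impulse, modes, fs)
```

Because each mode costs only a few multiply-adds per sample, the number of modes is the main knob trading realism against real-time cost.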
Adaptive FM Synthesis
This article describes an adaptive synthesis technique based on frequency (phase) modulation of arbitrary input signals. The background and motivation for the development of the technique, as well as related work, are discussed. A detailed description of delay line-based phase modulation of sinusoidal and complex signals is provided. The basic design of an implementation of the technique is presented and commented on. A series of examples using four different instrumental sources are discussed. The results show a wide range of possible effects through the use of the technique, from the addition of higher components to changes in the odd-even harmonic balance and the introduction of controlled inharmonicity.
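Delay line-based phase modulation can be sketched as reading the input through a delay whose length varies sinusoidally. The Python below is an illustrative sketch under that assumption, using linear interpolation for fractional delays. The function name and the `mod_freq` and `depth_samples` parameters are hypothetical, and it leaves out how an adaptive system would derive the modulation frequency from the input.

```python
import math

def delay_line_pm(x, fs, mod_freq, depth_samples):
    """Phase-modulate `x` by reading it through a sinusoidally varying delay."""
    base = depth_samples + 1.0                   # offset keeps the delay at least 1 sample
    out = []
    for n in range(len(x)):
        delay = base + depth_samples * math.sin(2 * math.pi * mod_freq * n / fs)
        pos = n - delay                          # fractional read position
        if pos < 0:
            out.append(0.0)                      # nothing in the delay line yet
            continue
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1 - frac) * x[i] + frac * nxt)   # linear interpolation
    return out

# Usage: modulate a 440 Hz sine at its own frequency with a 4-sample depth.
fs = 8000
x = [math.sin(2 * math.pi * 440 * n / fs) for n in range(800)]
plain = delay_line_pm(x, fs, 440.0, 0.0)         # zero depth: a fixed 1-sample delay
modulated = delay_line_pm(x, fs, 440.0, 4.0)
```

With zero depth the output is just the input delayed by one sample. Increasing the depth increases the phase deviation and, with it, the strength of the added sidebands.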
On the Application of RLS Adaptive Filtering for Voice Pitch Modification
This paper presents a pitch modification scheme, based on the recursive least-squares (RLS) adaptive algorithm, for speech and singing voice signals. The RLS filter is used to determine the linear prediction (LP) model on a sample-by-sample basis, as opposed to the LP-coding (LPC) method, which operates on a block basis. Therefore, an RLS-based approach is able to preserve the natural subtle variations in the vocal tract model, avoiding discontinuities in the synthesized signal and the inherent frame delay associated with classic methods. The LP residual is modified in the synthesis stage in order to generate the output signal. Listening tests verify the overall quality of the synthesized signal using the RLS approach, indicating that this technique is suitable for real-time applications.
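To illustrate sample-by-sample LP estimation, here is a sketch of the textbook RLS recursion with exponential forgetting, written in plain Python. It covers only coefficient tracking and the residual, not the pitch modification stage. The forgetting factor `lam` and the initialization constant `delta` are assumed values chosen for the example.

```python
import math

def rls_linear_predictor(x, order, lam=0.99, delta=100.0):
    """Track LP coefficients sample by sample with RLS.

    Returns the final coefficient vector and the a priori prediction residual.
    """
    w = [0.0] * order
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    residual = []
    for n in range(order, len(x)):
        u = [x[n - 1 - k] for k in range(order)]             # most recent past samples
        Pu = [sum(P[i][j] * u[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(u[i] * Pu[i] for i in range(order))
        k_gain = [v / denom for v in Pu]                     # gain vector
        err = x[n] - sum(w[i] * u[i] for i in range(order))  # a priori error
        w = [w[i] + k_gain[i] * err for i in range(order)]
        # P <- (P - k u^T P) / lam; u^T P equals Pu^T because P is symmetric.
        P = [[(P[i][j] - k_gain[i] * Pu[j]) / lam for j in range(order)]
             for i in range(order)]
        residual.append(err)
    return w, residual

# Usage: a pure sinusoid obeys x[n] = 2*cos(omega)*x[n-1] - x[n-2],
# so an order-2 predictor should converge to those two coefficients.
omega = 0.3
x = [math.sin(omega * n) for n in range(1000)]
w, residual = rls_linear_predictor(x, order=2)
```

The coefficients are updated at every sample, so there is no frame boundary at which the model can jump, in contrast to block-based LPC.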
Sinusoid Modeling in a Harmonic Context
This article discusses harmonic sinusoid modeling. Unlike standard sinusoid analyzers, the harmonic sinusoid analyzer keeps close watch on partial harmony from an early stage of modeling, and therefore guarantees the harmonic relationship among the sinusoids. The key element in harmonic sinusoid modeling is the harmonic sinusoid particle, which can be found by grouping short-time sinusoids. Instead of tracking short-time sinusoids, the harmonic tracker operates on harmonic particles directly. To express harmonic partial frequencies in a compact and robust form, we have developed an inequality-based representation with adjustable tolerance on frequency errors and inharmonicity, which is used in both the grouping and tracking stages. Frequency and amplitude continuity criteria are considered for tracking purposes. Numerical simulations are performed on simple synthesized signals.
Object Coding of Harmonic Sounds Using Sparse and Structured Representations
Object coding allows audio compression at extremely low bit-rates, provided that the objects are correctly modelled and identified. In this study, a codec has been implemented on the basis of a sparse decomposition of the signal with a dictionary of Instrument-Specific Harmonic atoms. The decomposition algorithm extracts “molecules”, i.e., linear combinations of such atoms, considered as note-like objects. Thus, they can be coded efficiently using note-specific strategies. For signals containing only harmonic sounds, the obtained bitrates are very low, typically around 2 kbit/s, and informal listening tests against a standard sinusoidal coder show promising performance.
Adaptive Threshold Determination for Spectral Peak Classification
A new approach to adaptive threshold selection for classification of peaks of audio spectra is presented. We extend the previous work on classification of sinusoidal and noise peaks based on a set of spectral peak descriptors in two ways. On the one hand, we propose a compact sinusoidal model where all the modulation parameters are defined with respect to the analysis window. This is of great importance, since the STFT spectra are closely related to the analysis window properties. On the other hand, we design a threshold selection algorithm that allows us to control the decision thresholds in an intuitive manner. The decision thresholds, calculated from the relationships established between the noise power in the signal and the distributions of sinusoidal peaks, ensure that all peaks described as sinusoidal will be correctly classified. We also show that the threshold selection algorithm can be used for different types of analysis windows with only a slight parameter readjustment.
Real-time Audio Processing via Segmented Wavelet Transform
In audio applications it is often necessary to process the signal in “real time”. The method of segmented wavelet transform (SegWT) makes it possible to compute the discrete-time wavelet transform of a signal segment by segment, without the classical “windowing”. This means that the method could be used for wavelet-type processing of an audio signal in real time or, alternatively, when a long signal must be processed but there is insufficient memory for it (e.g., on DSPs). In the paper, the principle of the segmented forward wavelet transform is explained and the algorithm is described in detail.
Statistical Measures of Early Reflections of Room Impulse Responses
An impulse response of an enclosed reverberant space is composed of three basic components: the direct sound, early reflections and late reverberation. While the direct sound is a single event that can be easily identified, the division between the early reflections and late reverberation is less obvious as there is a gradual transition between the two. This paper explores two statistical measures that can aid in determining a point in time where the early reflections have transitioned into late reverberation. These metrics exploit the similarities between late reverberation and Gaussian noise that are not commonly found in early reflections. Unlike other measures, these need no prior knowledge about the rooms such as geometry or volume.
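The abstract does not name its two measures. As one plausible example of a Gaussianity statistic, the Python sketch below computes block-wise excess kurtosis, which is near zero for Gaussian noise and large for sparse, spiky segments. The choice of kurtosis, the block length, and the synthetic "impulse response" are all assumptions for illustration, not the paper's data or method.

```python
import random

def kurtosis(window):
    """Excess kurtosis: about 0 for Gaussian data, large for sparse, spiky data."""
    n = len(window)
    mean = sum(window) / n
    m2 = sum((v - mean) ** 2 for v in window) / n
    m4 = sum((v - mean) ** 4 for v in window) / n
    return m4 / (m2 * m2) - 3.0

def block_kurtosis(h, win):
    """Excess kurtosis of consecutive non-overlapping blocks of length `win`."""
    return [kurtosis(h[i:i + win]) for i in range(0, len(h) - win + 1, win)]

# Usage: a toy response with a few isolated "reflections" followed by Gaussian noise.
rng = random.Random(0)
early = [0.0] * 1000
for idx in (0, 130, 290, 520, 810):
    early[idx] = 1.0
late = [rng.gauss(0.0, 1.0) for _ in range(3000)]
values = block_kurtosis(early + late, win=1000)
```

In this toy case the first block scores far above zero while the noise-like blocks stay close to zero, so the point where the statistic settles near its Gaussian value marks a candidate transition time. No room geometry or volume is needed to compute it.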