Asymmetries make the difference: A nonlinear model of transistor-based analog ring modulators
This work analyzes analog ring modulators based on bipolar transistors, such as those in the EMS VCS3 and the Doepfer A-114. It is shown that the perfectly symmetric standard model from the literature [1][2] does not suffice to describe crucial first-order effects. A detailed analysis of the circuit using mismatched parts is performed. The insights gained from this analysis are used to formulate a digital model which can be easily implemented and which captures the essential audible effects.
Instantaneous Harmonic Analysis for Vocal Processing
The paper considers the application of instantaneous harmonic analysis to a real-time vocal processing system for pitch, timbre and time-scale modifications. The analysis technique is based on narrow-band filtering using special analysis filters with a frequency-modulated impulse response. The main advantage of the technique is the high accuracy of harmonic parameter estimation, which provides adequate harmonic/noise separation and artifact-free implementation of voice modifications. The processing methods described in the paper are based on the harmonic+noise model.
Automated Equalization for Room Resonance Suppression
Estimating room resonances at the venues of large events and finding countermeasures is normally done by sound engineers, mainly before the event begins. This paper proposes an automated procedure that enhances the audio quality in event rooms by suppressing the room resonances with a parametric equalizer consisting of several high-Q peak filters. The room characteristics can be identified from a few measurements in the listening area during the event, using the original sound signal rather than an additional measuring signal. Based on these room characteristics, the equalization filters are designed automatically. Results for several rooms tested with the automated equalization for room resonance suppression are presented, together with a discussion of the topics covered.
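The abstract does not state how the peak filters are designed, so the following is only a minimal sketch of one common building block: a single second-order peaking filter in the widely used Audio EQ Cookbook form, configured as a narrow cut. The resonance frequency (120 Hz), gain (-9 dB), Q (12) and sample rate are illustrative assumptions, not values from the paper.

```python
import cmath
import math


def peak_filter_coeffs(f0, gain_db, q, fs):
    """Biquad peaking-EQ coefficients (Audio EQ Cookbook form), scaled so a[0] == 1.

    f0: center frequency in Hz, gain_db: boost (+) or cut (-) at f0,
    q: quality factor (higher means narrower), fs: sample rate in Hz.
    """
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha / a_lin
    b = [(1.0 + alpha * a_lin) / a0,
         -2.0 * math.cos(w0) / a0,
         (1.0 - alpha * a_lin) / a0]
    a = [1.0,
         -2.0 * math.cos(w0) / a0,
         (1.0 - alpha / a_lin) / a0]
    return b, a


def magnitude_db(b, a, f, fs):
    """Magnitude response of the biquad at frequency f, in dB."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * f / fs)
    num = b[0] + b[1] * z_inv + b[2] * z_inv * z_inv
    den = a[0] + a[1] * z_inv + a[2] * z_inv * z_inv
    return 20.0 * math.log10(abs(num / den))


# Hypothetical resonance: 120 Hz, suppressed by 9 dB with a narrow (Q = 12) cut.
b, a = peak_filter_coeffs(f0=120.0, gain_db=-9.0, q=12.0, fs=48000.0)
```

A parametric equalizer of the kind described would cascade several such sections, one per detected resonance; how the resonances are detected from the program signal is the subject of the paper and is not reproduced here.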
The Influence of Small Variations in a Simplified Guitar Amplifier Model
A strongly simplified guitar amplifier model, consisting of four stages, is presented. The exponential sweep technique is used to measure the frequency-dependent harmonic spectra. The influence of small variations of the system parameters on the harmonic components is analyzed. The differences of the spectra are explained and visualized.
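As background for the exponential sweep technique mentioned above, here is a minimal sketch of how such a test signal is commonly generated. The sweep limits, duration and sample rate are illustrative assumptions; the abstract does not give the measurement settings used in the paper.

```python
import math


def exp_sweep(f1, f2, duration, fs):
    """Exponential sine sweep from f1 to f2 Hz over `duration` seconds.

    The instantaneous frequency is f1 * exp(t / rate), where
    rate = duration / ln(f2 / f1).
    """
    rate = duration / math.log(f2 / f1)
    n = int(round(duration * fs))
    return [math.sin(2.0 * math.pi * f1 * rate * (math.exp(i / fs / rate) - 1.0))
            for i in range(n)]


def harmonic_lead_time(order, f1, f2, duration):
    """Time (s) by which the response of the given harmonic order precedes
    the linear response after deconvolution with the inverse sweep."""
    return duration / math.log(f2 / f1) * math.log(order)


# Hypothetical settings: 20 Hz to 2 kHz in one second at 8 kHz sampling.
sweep = exp_sweep(20.0, 2000.0, 1.0, 8000.0)
```

Because each harmonic order arrives at a distinct, known time offset, the individual harmonic responses can be windowed out separately, which is what makes frequency-dependent harmonic spectra measurable with a single sweep.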
Informed Selection of Frames for Music Similarity Computation
In this paper we present a new method to compute frame-based audio similarities, based on nearest neighbour density estimation. We do not recommend it as a practical method for large collections because of its high runtime. Rather, we use this new method for a detailed analysis to gain deeper insight into how a bag-of-frames (BOF) approach determines similarities among songs and, in particular, to identify those audio frames that make two songs similar from a machine’s point of view. Our analysis reveals that audio frames of very low energy, which are of course not the most salient with respect to human perception, have a surprisingly large influence on current similarity measures. Based on this observation we propose removing these low-energy frames before computing song models and show, via classification experiments, that the proposed frame selection strategy improves the audio similarity measure.
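The frame selection idea can be sketched as a simple energy gate applied before any song model is computed. This is a minimal illustration under stated assumptions: the frame length, hop size and the -40 dB floor relative to the loudest frame are hypothetical choices, and the paper's actual selection criterion may differ.

```python
import math


def frame_energies_db(signal, frame_len, hop):
    """Mean power of each frame in dB (a small offset avoids log of zero)."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def select_frames(signal, frame_len, hop, floor_db=-40.0):
    """Indices of frames whose energy is within `floor_db` of the loudest frame."""
    energies = frame_energies_db(signal, frame_len, hop)
    if not energies:
        return []
    loudest = max(energies)
    return [i for i, e in enumerate(energies) if e - loudest >= floor_db]


# Toy signal: a silent frame, a loud frame, and a very quiet frame.
toy = [0.0] * 8 + [1.0] * 8 + [0.001] * 8
kept = select_frames(toy, frame_len=8, hop=8)
```

Only the frames listed in `kept` would then be passed on to feature extraction and song modelling.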
Improvement of Acoustic Localization Using the STSA denoising with a novel Suppression Rule
This paper proposes new denoising filters within a framework whose aim is the localization of an acoustic source in a noisy environment. The main focuses are the automatic detection of transient sound events and the separation of the events of interest from the noise. A microphone array is used to capture time-spatial information, and an adaptive filter can be initialized to learn the ambient noise spectrum when signals of interest are absent. We propose an algorithm based on the Short Time Spectral Attenuation (STSA) method to remove the noise from each sensor of the array before the source localization task is performed. Time Difference Of Arrival (TDOA) methods are used for the localization of multiple sources. The experimental results show the efficiency of our framework in stationary noisy environments.
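To illustrate what a TDOA estimate is, the sketch below finds the lag that maximizes the plain time-domain cross-correlation between two sensor signals. This is a generic textbook approach chosen for illustration; the abstract does not say which TDOA estimator the paper uses, and the signals and sample rate here are made up.

```python
def estimate_delay(ref, delayed, max_lag):
    """Lag in samples that maximizes the cross-correlation of two signals.

    A positive result means `delayed` lags behind `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(ref):
            m = n + lag
            if 0 <= m < len(delayed):
                score += x * delayed[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def tdoa_seconds(ref, delayed, max_lag, fs):
    """Time difference of arrival in seconds at sample rate fs."""
    return estimate_delay(ref, delayed, max_lag) / fs


# Toy example: the second sensor receives the same transient 3 samples later.
mic_a = [0.0, 0.0, 1.0, 0.5, -0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
mic_b = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.5, -0.3, 0.0, 0.0]
lag = estimate_delay(mic_a, mic_b, max_lag=5)
```

In a noisy recording the correlation peak becomes less distinct, which is why denoising each sensor signal before this step, as the abstract describes, can improve localization.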
Re-targeting Expressive Musical Style Using a Machine-Learning Method
Expressive musical performing style involves more than what is simply represented in the score. Performers imprint their personal style on each performance based on their musical understanding. Expressive performing style makes the music come alive by shaping it through continuous variation. It is observed that musical style can be represented by appropriate numerical parameters, most of which are related to dynamics. It is also observed that performers tend to perform music sections and motives of similar shape in similar ways, where the sections and motives can be identified by an automatic phrasing algorithm. An experiment is proposed for producing expressive music from raw quantized music files using machine-learning methods such as Support Vector Machines. Experimental results show that it is possible to induce some of a performer’s style by using the music parameters extracted from audio recordings of their real performances.
Phase-Change based Tuning for Automatic Chord Recognition
This paper focuses on the automatic extraction of acoustic chord sequences from a piece of music. First, a set of different windowing methods for the Discrete Fourier Transform is evaluated in terms of efficiency. Then, a new tuning solution is introduced, based on a method previously developed for the phase vocoder. Pitch class profile vectors, which represent harmonic information, are extracted from the given audio signal. The resulting chord sequence is obtained by running a Viterbi decoder on trained hidden Markov models. We performed several experiments using the proposed technique. Results obtained on 175 manually labeled songs show an accuracy comparable to the state of the art.
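The role of tuning in a pitch class profile can be sketched as follows: spectral peaks are folded onto the twelve pitch classes relative to a reference frequency for A4, so a wrong reference shifts energy into neighbouring classes. This is a simplified illustration only. The function names, the peak-list input and the default 440 Hz reference are assumptions, and the phase-based tuning estimate proposed in the paper is not reproduced here.

```python
import math


def pitch_class(freq_hz, ref_a4=440.0):
    """Pitch class (0 = C, ..., 9 = A, ..., 11 = B) nearest to a frequency."""
    midi = 69.0 + 12.0 * math.log2(freq_hz / ref_a4)
    return int(round(midi)) % 12


def pitch_class_profile(peaks, ref_a4=440.0):
    """Fold (frequency, magnitude) peaks into a normalized 12-bin profile."""
    pcp = [0.0] * 12
    for freq, mag in peaks:
        if freq > 0.0:
            pcp[pitch_class(freq, ref_a4)] += mag
    total = sum(pcp)
    return [v / total for v in pcp] if total > 0.0 else pcp


# Hypothetical spectral peaks of a C major chord (C4, E4, G4, C5).
peaks = [(261.63, 1.0), (329.63, 1.0), (392.0, 1.0), (523.25, 1.0)]
profile = pitch_class_profile(peaks)
```

In a chord recognition pipeline of the kind described, one such profile per analysis frame would serve as the observation vector for the hidden Markov model.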
An iterative Segmentation Algorithm for Audio Signal Spectra Depending on Local Centers of Gravity
Modern music production and sound generation often rely on the manipulation of pre-recorded pieces of audio, so-called samples, taken from a huge database. Consequently, there is an increasing demand to adapt these samples extensively and flexibly to any new musical context. For this purpose, advanced digital signal processing is needed in order to realize audio effects like pitch shifting, time stretching or harmonization. Often, a key part of these processing methods is a signal-adaptive, block-based spectral segmentation operation. Hence, we propose a novel algorithm for such a spectral segmentation based on local centers of gravity (COG). The method was originally developed as part of a multiband modulation decomposition for audio signals. Nevertheless, this algorithm can also be used in the more general context of improved vocoder-related applications.
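For readers unfamiliar with the term, a local center of gravity is the magnitude-weighted mean bin index inside a window of the spectrum. The sketch below computes it and then repeatedly moves a candidate position to its local COG until it stops changing. The iteration rule, window half-width and toy spectrum are assumptions made for illustration; they are not the segmentation algorithm of the paper, whose details the abstract does not give.

```python
def local_cog(mag, center, half_width):
    """Magnitude-weighted mean bin index in a window around `center`."""
    lo = max(0, center - half_width)
    hi = min(len(mag), center + half_width + 1)
    weight = sum(mag[lo:hi])
    if weight == 0.0:
        return float(center)
    return sum(k * mag[k] for k in range(lo, hi)) / weight


def settle_on_cog(mag, start, half_width, max_iter=50):
    """Move a candidate bin to its local COG until it no longer changes."""
    pos = start
    for _ in range(max_iter):
        nxt = int(round(local_cog(mag, pos, half_width)))
        if nxt == pos:
            break
        pos = nxt
    return pos


# Toy magnitude spectrum with two spectral concentrations.
spectrum = [0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0, 0.0, 2.0, 6.0, 2.0, 0.0]
left = settle_on_cog(spectrum, start=1, half_width=2)
right = settle_on_cog(spectrum, start=12, half_width=2)
```

Candidates that settle on the same position can be treated as belonging to one spectral segment, which conveys the general idea of a COG-driven segmentation.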
Finding Latent Sources in Recorded Music with a Shift-invariant HDP
We present the Shift-Invariant Hierarchical Dirichlet Process (SIHDP), a nonparametric Bayesian model for modeling multiple songs in terms of a shared vocabulary of latent sound sources. The SIHDP is an extension of the Hierarchical Dirichlet Process (HDP) that explicitly models the times at which each latent component appears in each song. This extension allows us to model how sound sources evolve over time, which is critical to the human ability to recognize and interpret sounds. To make inference on large datasets possible, we develop an exact distributed Gibbs sampling algorithm to do posterior inference. We evaluate the SIHDP’s ability to model audio using a dataset of real popular music, and measure its ability to accurately find patterns in music using a set of synthesized drum loops. Ultimately, our model produces a rich representation of a set of songs consisting of a set of short sound sources and when they appear in each song.