Download Analysis of piano tones using an inharmonic inverse comb filter This paper presents a filter configuration for canceling and separating partials from inharmonic piano tones. The proposed configuration is based on inverse comb filtering, in which the delay line is replaced with a high-order filter that has a proper phase response. Two filter design techniques are tested with the method: an FIR filter, which is designed using frequency sampling, and an IIR filter, which consists of a set of second-order allpass filters that match the desired group delay. It is concluded that it is possible to obtain more accurate results with the FIR filter, while the IIR filter is computationally more efficient. The paper shows that the proposed analysis method provides an effective and easy way of extracting the residual signal and selecting partials from piano tones. This method is suitable for analysis of recorded piano tones.
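As a rough illustration of the starting point for this abstract, the sketch below shows a plain feedforward inverse comb filter with an integer delay line. It is not the paper's configuration, which replaces the delay line with a high-order FIR or allpass filter whose phase response follows the inharmonic partials. The sample rate, fundamental, and the exaggerated 5.3 partial ratio are assumptions chosen for the demonstration.

```python
import numpy as np

def inverse_comb(x, delay):
    """Feedforward inverse comb: y[n] = x[n] - x[n - delay]."""
    y = x.copy()
    y[delay:] -= x[:-delay]
    return y

fs = 8000                      # sample rate in Hz (assumed)
f0 = 100.0                     # fundamental in Hz; one period is fs / f0 = 80 samples
delay = int(round(fs / f0))
n = np.arange(fs)

# Five exactly harmonic partials: every one sits on a zero of the filter.
harmonic = sum(np.sin(2 * np.pi * k * f0 * n / fs) / k for k in range(1, 6))
# One partial moved off the harmonic grid (hypothetical, exaggerated shift).
shifted = np.sin(2 * np.pi * 5.3 * f0 * n / fs)

# Skip the first `delay` samples, where the delay line is still filling.
residual_harmonic = np.max(np.abs(inverse_comb(harmonic, delay)[delay:]))
residual_shifted = np.max(np.abs(inverse_comb(shifted, delay)[delay:]))
print(residual_harmonic, residual_shifted)
```

The harmonic partials cancel almost exactly while the shifted partial passes through, which is why a fixed delay line is not enough for inharmonic tones and a frequency-dependent delay is needed.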
Download Sound transformation by descriptor using an analytic domain In many applications of sound transformation, such as sound design, mixing, mastering, and composition, the user interactively searches for appropriate parameters. However, automatic applications of sound transformation, such as mosaicing, may require choosing parameters without user intervention. When the target can be specified by its synthesis context, or by example (from features of the example), “adaptive effects” can provide such control. But there exist few general strategies for building adaptive effects from arbitrary sets of transformations and descriptor targets. In this study, we decouple the usually direct link between analysis and transformation in adaptive effects, attempting to include more diverse transformations and descriptors in adaptive transformation, albeit at the cost of additional complexity or difficulty. We build an analytic model of a deliberately simple transformation-descriptor (TD) domain, and show some preliminary results.
Download Frame level audio similarity - A codebook approach Modeling audio signals via the long-term statistical distribution of their local spectral features – often denoted as the bag of frames (BOF) approach – is a popular and powerful method to describe audio content. While modeling the distribution of local spectral features by semi-parametric distributions (e.g. Gaussian Mixture Models) has been studied intensively, we investigate a non-parametric variant based on vector quantization (VQ) in this paper. The essential advantage of the proposed VQ approach over state-of-the-art audio similarity measures is that the similarity metric proposed here forms a normed vector space. This allows for more powerful search strategies, e.g. KD-Trees or Locality-Sensitive Hashing (LSH), making content-based audio similarity available for even larger music archives. Standard VQ approaches are known to be computationally very expensive; to counter this problem, we propose a multi-level clustering architecture. Additionally, we show that the multi-level vector quantization approach (ML-VQ), in contrast to standard VQ approaches, is comparable to state-of-the-art frame-level similarity measures in terms of quality. Another important finding w.r.t. the ML-VQ approach is that, in contrast to GMM models of songs, our approach does not seem to suffer from the recently discovered hub problem.
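The codebook idea in this abstract can be sketched in a few lines, assuming a single-level k-means codebook rather than the paper's multi-level architecture. Each track becomes a normalized histogram of codeword counts, and tracks are compared with the L1 norm of the histogram difference. The choice of L1, the synthetic 4-dimensional "frames", and all function names are assumptions made for illustration.

```python
import numpy as np

def quantize(frames, centers):
    """Index of the nearest codeword (squared Euclidean) for each frame."""
    d = ((frames[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def train_codebook(frames, k, iters=20, seed=0):
    """Plain Lloyd k-means; a stand-in for the paper's multi-level clustering."""
    rng = np.random.default_rng(seed)
    centers = frames[rng.choice(len(frames), k, replace=False)]
    for _ in range(iters):
        labels = quantize(frames, centers)
        for j in range(k):
            if np.any(labels == j):          # leave empty clusters where they are
                centers[j] = frames[labels == j].mean(axis=0)
    return centers

def histogram(frames, centers):
    """Normalized codeword histogram: one fixed-length vector per track."""
    counts = np.bincount(quantize(frames, centers), minlength=len(centers))
    return counts / counts.sum()

# Synthetic feature frames standing in for local spectral features.
rng = np.random.default_rng(0)
track_a1 = rng.normal(0.0, 1.0, size=(500, 4))
track_a2 = rng.normal(0.0, 1.0, size=(500, 4))   # same distribution as a1
track_b = rng.normal(3.0, 1.0, size=(500, 4))    # different distribution

codebook = train_codebook(np.vstack([track_a1, track_a2, track_b]), k=8)
h_a1, h_a2, h_b = (histogram(t, codebook) for t in (track_a1, track_a2, track_b))
dist_same = np.abs(h_a1 - h_a2).sum()
dist_diff = np.abs(h_a1 - h_b).sum()
print(dist_same, dist_diff)
```

Because every track is reduced to a fixed-length vector compared under a norm, standard index structures such as KD-trees can be applied directly to the histograms.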
Download Comb-filter free audio mixing using STFT magnitude spectra and phase estimation This paper presents a new audio mixing algorithm which avoids comb-filter distortions when mixing an input signal with time-delayed versions of itself. Instead of a simple signal addition in the time domain, the proposed method calculates the short-time Fourier magnitude spectra of the input signals and adds them. The sum determines the output magnitude on the time-frequency plane, whereas a modified RTISI algorithm estimates the missing phase information. An evaluation using PEAQ shows that the proposed method yields much better results than temporal mixing for nonzero delays up to 10 ms.
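The difference between the two mixing strategies can be seen on a single analysis frame. The sketch below covers only the magnitude-addition step; the phase estimation stage (a modified RTISI in the paper) is not shown. The sample rate, frame length, and 1 ms delay are assumed values, and the test tone is placed deliberately on the first comb-filter notch.

```python
import numpy as np

fs = 8000                              # sample rate in Hz (assumed)
N = 1024                               # frame length (assumed)
d = 8                                  # delay in samples, 1 ms at 8 kHz
f_notch = fs / (2 * d)                 # 500 Hz: first notch of the comb filter
n = np.arange(N + d)
source = np.sin(2 * np.pi * f_notch * n / fs)

x = source[d:]                         # direct signal
x_delayed = source[:-d]                # the same signal, d samples later
window = np.hanning(N)

# Time-domain addition: the two copies are in antiphase and cancel.
time_domain_mix = np.abs(np.fft.rfft(window * (x + x_delayed)))
# Magnitude addition: the spectra are summed without regard to phase.
magnitude_mix = (np.abs(np.fft.rfft(window * x))
                 + np.abs(np.fft.rfft(window * x_delayed)))

k = int(round(f_notch * N / fs))       # FFT bin of the test tone
print(time_domain_mix[k], magnitude_mix[k])
```

At the notch frequency the time-domain mix is essentially zero, while the summed magnitudes keep the energy of both inputs. Turning that magnitude spectrogram back into a waveform is where the phase estimation comes in.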
Download Sliding with a constant Q The linear frequency (constant-bandwidth) scale of the FFT has long been recognised as a disadvantage for audio processing. Long analysis windows are required for adequate low-frequency resolution, while small windows offer lower latency, better handling of transients, and reduced computation cost. A constant-Q form of analysis offers the possibility of increased low-frequency resolution for a given window size, this resolution being essential for many fundamental processing tasks such as pitch shifting. We consider the application of the Sliding Discrete Fourier Transform to a Constant-Q analysis. The increased flexibility of sliding allows for a variety of data alignments, and we produce the mathematical formulation of these. Windowing in the frequency domain introduces further complications. Finally we consider the implementation of the analysis on both serial and parallel computers.
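For reference, the standard linear-frequency Sliding DFT recurrence that this abstract builds on can be sketched as follows. The constant-Q formulation, the alternative data alignments, and the frequency-domain windowing discussed in the paper are not reproduced here; the function name and test signal are assumptions.

```python
import numpy as np

def sliding_dft(x, N):
    """N-point DFT of every length-N window of x, advancing one sample at a time.

    Uses X_k[n] = (X_k[n-1] - x[n-N] + x[n]) * exp(2j*pi*k/N), so each
    update costs O(N) instead of the O(N log N) of a fresh FFT.
    """
    k = np.arange(N)
    twiddle = np.exp(2j * np.pi * k / N)
    X = np.fft.fft(x[:N])              # initialise with one direct transform
    frames = [X]
    for n in range(N, len(x)):
        X = (X - x[n - N] + x[n]) * twiddle
        frames.append(X)
    return np.array(frames)

rng = np.random.default_rng(0)
signal = rng.standard_normal(80)
frames = sliding_dft(signal, 16)

# Check against a direct FFT of each window.
reference = np.array([np.fft.fft(signal[i:i + 16]) for i in range(len(frames))])
max_error = np.max(np.abs(frames - reference))
print(frames.shape, max_error)
```

Each bin is updated independently of the others, which is what makes the scheme attractive for parallel hardware and leaves room to give each bin its own window length in a constant-Q setting.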
Download Vocal melody detection in the presence of pitched accompaniment using harmonic matching methods Vocal music is characterized by a melodically salient singing voice accompanied by one or more instruments. With a pitched instrument background, multiple periodicities are simultaneously present and the task becomes one of identifying and tracking the vocal pitch based on pitch strength and smoothness constraints. Frequency domain harmonic matching methods can be applied to detect pitch via the harmonically related frequencies that fit the signal’s measured spectral peaks. The specific spectral fitness measure is expected to influence the performance of vocal pitch detection depending on the nature of the polyphonic mixture. In this work, we consider Indian classical music which provides important examples of singing voice accompanied by strongly pitched instruments. It is shown that the spectral fitness measure of the two-way mismatch method is well suited to track vocal pitch in the presence of the pitched percussion with its strong but sparse harmonic structure. The detected pitch is further used to obtain a measure of voicing that reliably discriminates vocal segments from purely instrumental regions.
Download Multiple-F0 tracking based on a high-order HMM model This paper is about multiple-F0 tracking and the estimation of the number of harmonic source streams in music sound signals. A source stream is understood as generated from a note played by a musical instrument. A note is described by a hidden Markov model (HMM) having two states: the attack state and the sustain state. It is proposed to first perform the tracking of F0 candidates using a high-order hidden Markov model, based on a forward-backward dynamic programming scheme. The propagated weights are calculated in the forward tracking stage, followed by an iterative tracking of the most likely trajectories in the backward tracking stage. Then, the estimation of the underlying source streams is carried out by iteratively pruning the candidate trajectories in a maximum likelihood manner. The proposed system is evaluated on a specially constructed polyphonic music database. Compared with frame-based estimation systems, the tracking mechanism significantly improves the accuracy rate.
Download Notes on Model-Based Non-Stationary Sinusoid Estimation Methods Using Derivatives This paper reviews the derivative method and explores its capacity for estimating time-varying sinusoids with complicated parameter variations. The method is reformulated on a generalized signal model. We show that under certain arrangements the estimation task becomes solving a linear system, whose coefficients can be computed from discrete samples using an integration-by-parts technique. Previous derivative and reassignment methods are shown to be special cases of this generic method. We include a discussion on the continuity criterion of window design for the derivative method. The effectiveness of the method and the window design criterion are confirmed by test results. We also show that, thanks to the generalization, off-model sinusoids can be approximated by the derivative method with a sufficiently flexible model setting.
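The simplest special case of the idea in this abstract is a stationary complex sinusoid, for which x'(t) = jωx(t). Integration by parts moves the derivative from the signal onto the window, so ω follows from two windowed inner products. The sketch below assumes that stationary model, a periodic Hann window (which vanishes at the frame edges), and a hypothetical function name; it coincides with the classic frequency reassignment estimate and is not the paper's generalized linear system.

```python
import numpy as np

def estimate_frequency(x, k):
    """Frequency (rad/sample) of a stationary complex sinusoid near DFT bin k.

    With w vanishing at the frame edges, integration by parts gives
    <x', w e_k> = -<x, w' e_k> + j*w_k*<x, w e_k>, and x' = j*omega*x yields
    omega = w_k - Im(X_dw / X_w).
    """
    N = len(x)
    n = np.arange(N)
    w = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)        # periodic Hann window
    dw = (np.pi / N) * np.sin(2 * np.pi * n / N)     # its analytic derivative
    basis = np.exp(-2j * np.pi * k * n / N)
    X_w = np.sum(x * w * basis)
    X_dw = np.sum(x * dw * basis)
    return 2 * np.pi * k / N - np.imag(X_dw / X_w)

N = 1024
n = np.arange(N)
true_omega = 2 * np.pi * 100.37 / N                  # deliberately between bins
x = 0.8 * np.exp(1j * (true_omega * n + 0.4))
estimate = estimate_frequency(x, 100)
print(true_omega, estimate)
```

The window must go smoothly to zero at the frame boundaries for the boundary terms of the integration by parts to vanish, which is one way to see why the abstract singles out a continuity criterion for window design.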
Download A Frequency Domain Adaptive Algorithm for Wave Separation We propose a frequency domain adaptive algorithm for wave separation in wind instruments. Forward and backward travelling waves are obtained from the signals acquired by two microphones placed along the tube, while the separation filter is adapted from the information given by a third microphone. Working in the frequency domain has a series of advantages, among which are the ease of design of the propagation filter and its differentiation with respect to its parameters. Although the adaptive algorithm was developed as a first step for the estimation of playing parameters in wind instruments it can also be used, without any modifications, for other applications such as in-air direction of arrival (DOA) estimation. Preliminary results on these applications will also be presented.
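The non-adaptive core of two-microphone wave separation can be written down per frequency bin. The sketch below assumes a lossless tube with a known microphone spacing d and speed of sound c, so the propagation filter is a pure delay H = exp(-j2πfd/c). In the paper this filter is adapted using a third microphone, which is not shown; the function name, spacing, and synthetic spectra are assumptions.

```python
import numpy as np

def separate_waves(P1, P2, freqs, d, c=343.0):
    """Split two pressure spectra into forward (F) and backward (B) waves at mic 1.

    Model: P1 = F + B and P2 = F*H + B/H, with H the propagation from mic 1
    to mic 2. Bins where H is close to +1 or -1 are ill-conditioned and skipped.
    """
    H = np.exp(-2j * np.pi * freqs * d / c)
    det = H - 1.0 / H
    ok = np.abs(det) > 1e-3
    F = np.zeros(P1.shape, dtype=complex)
    B = np.zeros(P1.shape, dtype=complex)
    F[ok] = (P2[ok] - P1[ok] / H[ok]) / det[ok]
    B[ok] = (P1[ok] * H[ok] - P2[ok]) / det[ok]
    return F, B, ok

# Synthetic check: build microphone spectra from known travelling waves.
rng = np.random.default_rng(1)
freqs = np.linspace(0.0, 4000.0, 257)
d = 0.03                                             # mic spacing in metres (assumed)
F_true = rng.standard_normal(257) + 1j * rng.standard_normal(257)
B_true = rng.standard_normal(257) + 1j * rng.standard_normal(257)
H = np.exp(-2j * np.pi * freqs * d / 343.0)
P1 = F_true + B_true
P2 = F_true * H + B_true / H

F_est, B_est, ok = separate_waves(P1, P2, freqs, d)
print(ok.sum(), np.max(np.abs(F_est[ok] - F_true[ok])))
```

Because the separation is a closed-form expression in H, its derivative with respect to the parameters of H is easy to write down, which is the convenience of the frequency-domain formulation that the abstract points to.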