Re-Thinking Sound Separation: Prior Information and Additivity Constraint in Separation Algorithms
In this paper, we study the effect of prior information on the quality of informed source separation algorithms. We present results with our system for solo and accompaniment separation and contrast our findings with two other state-of-the-art approaches. Results suggest that the separation techniques themselves, rather than the extraction of the prior information, currently limit the achievable performance. Furthermore, we present an alternative view of the separation process in which the additivity constraint of the algorithm is removed in an attempt to maximize the obtained quality. Plausible future directions in sound separation research are discussed.
Study of Regularizations and Constraints in NMF-Based Drums Monaural Separation
Drum modelling is of special interest in musical source separation because of the drums' widespread presence in Western popular music. Current research has often focused on drum separation without specifically modelling the other sources present in the signal. This paper presents an extensive study of the use of regularizations and constraints to drive the factorization towards a separation between the percussive part and the non-percussive musical accompaniment. The proposed regularizations control the frequency smoothness of the basis components and the temporal sparseness of the gains. We also evaluated the use of temporal constraints on the gains to perform the separation, using both ground-truth manual annotations (made publicly available) and automatically extracted transients. Objective evaluation of the results shows that, while the optimal regularizations are highly signal-dependent, drum event positions alone contain enough information to achieve high-quality separation.
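The sparseness regularization described above can be sketched with a standard sparse-NMF variant: multiplicative updates for the Euclidean cost with an L1 penalty on the gain matrix. All specifics here (component count, penalty weight, iteration count) are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy magnitude spectrogram (frequency bins x time frames).
V = np.abs(rng.standard_normal((64, 100)))

K = 4       # number of basis components (hypothetical choice)
lam = 0.1   # weight of the L1 sparsity penalty on the gains (hypothetical)

W = np.abs(rng.standard_normal((64, K)))   # spectral bases
H = np.abs(rng.standard_normal((K, 100)))  # temporal gains

eps = 1e-9
err0 = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
for _ in range(200):
    # Multiplicative updates for Euclidean NMF; the lam term in the
    # denominator shrinks H towards zero, encouraging temporally sparse gains.
    H *= (W.T @ V) / (W.T @ W @ H + lam + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Percussive components would additionally be steered by smoothness terms on `W` and the temporal constraints on `H` that the paper evaluates; this sketch shows only the sparsity mechanism.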
Reverse Engineering Stereo Music Recordings Pursuing an Informed Two-Stage Approach
A cascaded reverse engineering approach is presented which uses an explicit model of the music production chain. The model considers both the mixing and the mastering stages and incorporates a parametric signal model. The approach is pursued in an informed scenario: the model parameters are attached as auxiliary data to the mastered mix and are later used to undo the mastering and the mixing. The validity of the approach is demonstrated on a stereo mixture.
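The core idea of the informed scenario, namely that exact production parameters travel with the mix and make the chain invertible, can be illustrated with the simplest possible mixing model: a known instantaneous stereo mixing matrix, transmitted as side information and inverted at the receiver. The matrix values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two toy source signals.
s = rng.standard_normal((2, 1000))

# A known stereo mixing matrix, attached to the mix as auxiliary data
# (hypothetical panning gains standing in for the paper's parametric model).
A = np.array([[0.8, 0.3],
              [0.2, 0.9]])

x = A @ s   # the stereo mixture produced at the mixing stage

# Informed "un-mixing": with the exact parameters available at the
# receiver, the mixing stage is simply inverted.
s_hat = np.linalg.solve(A, x)
```

The paper's actual chain also includes a mastering stage and a parametric signal model, both undone in cascade; this sketch captures only why attached parameters make such inversion exact rather than estimated.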
Source Separation and Analysis of Piano Music Signals Using Instrument-Specific Sinusoidal Model
Many existing monaural source separation systems use sinusoidal modeling to represent pitched musical sounds during the separation process. In these systems, a musical sound is represented by a sum of time-varying sinusoidal components, and the goal of source separation is to estimate the parameters of each component. Here, we propose an instrument-specific sinusoidal model tailored to the piano tone. Based on our proposed Piano Model, we develop a monaural source separation system that extracts each individual tone from mixtures of piano tones and, at the same time, identifies the intensity and adjusts the onset of each tone to characterize the nuance of the music performance. The major difficulty of the source separation problem is resolving overlapping partials. Our solution collects training data from isolated tones to train the Piano Model, which captures the properties common across reappearances of each pitch and thereby helps to separate the mixtures. This approach enables high separation quality even for octaves, in which the partials of the upper tone completely overlap with those of the lower tone. The results show that our proposed system gives robust and accurate separation of piano tone mixtures (including octaves), with quality significantly better than that reported in previous work.
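The sinusoidal representation the abstract starts from, a sum of time-varying sinusoidal components, can be sketched directly. The partial amplitudes and decay rates below are made-up values for a piano-like tone, not parameters of the paper's Piano Model.

```python
import numpy as np

sr = 16000
t = np.arange(int(0.5 * sr)) / sr   # half a second of samples

# A toy "piano-like" tone: a sum of exponentially decaying partials.
# Each entry is (harmonic number, amplitude, decay rate) -- all hypothetical.
f0 = 220.0
partials = [(1, 1.00, 3.0),
            (2, 0.50, 4.0),
            (3, 0.25, 5.0)]

tone = np.zeros_like(t)
for k, a, d in partials:
    # Time-varying sinusoidal component: decaying amplitude envelope
    # times a sinusoid at the k-th harmonic.
    tone += a * np.exp(-d * t) * np.sin(2 * np.pi * k * f0 * t)
```

Separation then amounts to estimating such per-component parameters for every tone in the mixture; the octave case is hard precisely because the upper tone's partials coincide with every second partial of the lower one.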
Low-Latency Bass Separation Using Harmonic-Percussion Decomposition
Many recent approaches to musical source separation rely on model-based inference methods that take into account the signal’s harmonic structure. To address the particular case of low-latency bass separation, we propose a method that combines harmonic decomposition using a Tikhonov regularization-based algorithm with peak contrast analysis of the pitch likelihood function. Our experiment compares the separation performance of this method to a naive low-pass filter, a state-of-the-art NMF-based method and a near-optimal binary mask. The proposed low-latency method achieves results similar to the NMF-based high-latency approach at a lower computational cost. The method is therefore suitable for real-time implementation.
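The Tikhonov-regularized harmonic decomposition can be sketched as a regularized least-squares fit of a single spectral frame against a dictionary of harmonic templates, with the largest gain acting as a pitch-likelihood peak. The template shape, candidate pitches and regularization weight below are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(2)

n_bins = 128
freqs = np.arange(n_bins)

def harmonic_template(f0, n_harm=6, width=1.5):
    # Gaussian-smeared harmonic comb: a hypothetical stand-in for the
    # paper's harmonic basis.
    tmpl = np.zeros(n_bins)
    for h in range(1, n_harm + 1):
        tmpl += np.exp(-0.5 * ((freqs - h * f0) / width) ** 2)
    return tmpl / np.linalg.norm(tmpl)

pitches = [8, 10, 12, 15]               # candidate f0 bins (hypothetical)
B = np.stack([harmonic_template(p) for p in pitches], axis=1)

# Observed spectrum frame: dominated by the second candidate pitch + noise.
x = 3.0 * B[:, 1] + 0.05 * np.abs(rng.standard_normal(n_bins))

# Tikhonov-regularized least squares: g = (B^T B + lam I)^-1 B^T x.
lam = 0.1
g = np.linalg.solve(B.T @ B + lam * np.eye(len(pitches)), B.T @ x)

best = int(np.argmax(g))                # peak of the pitch likelihood
```

The regularizer keeps the per-frame solve well-conditioned, which is what makes a frame-by-frame (and hence low-latency) decomposition practical; the paper then applies peak contrast analysis to decide whether that peak is a reliable bass pitch.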
A 3D Multi-Plate Environment for Sound Synthesis
In this paper, a physics-based sound synthesis environment is presented which is composed of several plates, under nonlinear conditions, coupled with the surrounding acoustic field. The equations governing the behaviour of the system are implemented numerically using finite difference time domain methods. The number of plates, their position relative to a 3D computational enclosure and their physical properties can all be specified by the user; simple control parameters allow the musician/composer to play the virtual instrument. Spatialised sound outputs may be sampled from the simulated acoustic field using several channels simultaneously. Implementation details and control strategies for this instrument will be discussed; simulation results and sound examples will be presented.
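A minimal sketch of the finite difference time domain idea, here for a single linear Kirchhoff thin plate with zeroed boundaries, no nonlinearity and no coupling to an acoustic field, so far simpler than the paper's system. Grid size, stiffness and the stability parameter are illustrative choices.

```python
import numpy as np

# Explicit FDTD scheme for a linear thin plate, u_tt = -kappa^2 * lap(lap(u)).
N = 32                    # grid points per side (hypothetical)
kappa = 1.0               # stiffness parameter (hypothetical)
h = 1.0 / N
mu = 0.2                  # kappa*dt/h^2, kept below the 1/4 stability bound
dt = mu * h ** 2 / kappa

def laplacian(u):
    out = np.zeros_like(u)
    out[1:-1, 1:-1] = (u[2:, 1:-1] + u[:-2, 1:-1] +
                       u[1:-1, 2:] + u[1:-1, :-2] - 4 * u[1:-1, 1:-1]) / h ** 2
    return out

# Gaussian initial displacement in the centre, modelling a strike.
x = np.linspace(0, 1, N)
X, Y = np.meshgrid(x, x, indexing="ij")
u = np.exp(-200 * ((X - 0.5) ** 2 + (Y - 0.5) ** 2))
u_prev = u.copy()         # zero initial velocity

out = []
for _ in range(200):
    # Leapfrog update; the biharmonic operator is two nested Laplacians.
    u_next = 2 * u - u_prev - (kappa * dt) ** 2 * laplacian(laplacian(u))
    u_next[0, :] = u_next[-1, :] = u_next[:, 0] = u_next[:, -1] = 0.0
    u_prev, u = u, u_next
    out.append(u[N // 4, N // 4])   # "pickup" read from one grid point
```

Reading the output from points of the simulated field, as in the last line, is the same mechanism the paper uses to draw multiple spatialised output channels from the 3D enclosure.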
Pure Data External for Reactive HMM-Based Speech and Singing Synthesis
In this paper, we present recent progress in the MAGE project. MAGE is a library for reactive HMM-based speech and singing synthesis. Here, it is integrated as a Pure Data external, called mage~, which provides reactive voice quality, prosody and identity manipulation combined with contextual control. mage~ brings together the high-quality, natural and expressive speech of HMM-based speech synthesis with high flexibility and reactive control over the speech production level. Such an object provides a basis for further research in gesturally controlled speech synthesis: an object that can “listen” and reactively adjust itself to its environment. Further, building on mage~, we create different interfaces and controllers to explore the real-time, expressive and interactive nature of speech.
Simulation of Textured Audio Harmonics Using Random Fractal Phaselets
We present a method of simulating audio signals using the principles of random fractal geometry which, in the context of this paper, is concerned with the analysis of statistically self-affine ‘phaselets’. The approach is used to generate audio signals, such as those associated with bowed string instruments, whose texture and timbre are characterised through the Fractal Dimension. The paper provides a short overview of potential simulation methods using Artificial Neural Networks and Evolutionary Computing, and of the problems associated with a deterministic approach based on solutions to the acoustic wave equation. This serves to quantify the origins of the ‘noise’ associated with the multiple scattering events that characterise texture and timbre in an audio signal. We then explore a method to compute the phaselet of a phase signal, i.e. the primary phase function of which the phase signal is, to a good approximation, a periodic replica, and show that, by modelling the phaselet as a random fractal signal, it can be characterised by the Fractal Dimension. The Fractal Dimension is then used to synthesise a phaselet, from which the phase function is computed through multiple concatenations of the phaselet. The paper details the principal steps of the method, examines some example results, and provides a URL to m-coded functions for interested readers to reproduce the results and develop the algorithms further.
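The synthesis step can be sketched by generating a random fractal signal with a power-law Fourier spectrum and then concatenating it periodically into a phase function. The spectral-shaping construction and the exponent relation used below (Fourier exponent beta = 5 - 2D for a fractal signal of Fractal Dimension D) are standard random-fractal assumptions, not details taken from the paper, whose m-code should be consulted for the actual procedure.

```python
import numpy as np

rng = np.random.default_rng(3)

def fractal_signal(n, D):
    # Shape white noise with a power-law spectrum |F(f)| ~ f^(-beta/2),
    # assuming the fractal-signal relation beta = 5 - 2*D.
    beta = 5.0 - 2.0 * D
    spectrum = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n)
    f[0] = f[1]                      # avoid division by zero at DC
    spectrum *= f ** (-beta / 2.0)
    sig = np.fft.irfft(spectrum, n)
    return sig / np.abs(sig).max()   # normalise to unit peak

# Synthesise one phaselet for a chosen Fractal Dimension (hypothetical value).
phaselet = fractal_signal(1024, D=1.5)

# Build the phase function as a periodic replica of the phaselet.
phase = np.concatenate([phaselet] * 8)
```

The Fractal Dimension thus acts as the single texture/timbre control: rougher (higher-D) phaselets yield noisier, more textured harmonics once the phase function drives the final audio synthesis.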
Source Filter Model For Expressive Gu-Qin Synthesis and its iOS App
The Gu-Qin is a venerable Chinese plucked-string instrument with unique performance techniques and enchanting sounds. It is inscribed on the UNESCO Representative List of the Intangible Cultural Heritage of Humanity and is one of the oldest Chinese solo instruments. The variation of the Gu-Qin's sound is so large that carefully designed controls are necessary for its computer synthesizer. We developed a parametric source-filter model for re-synthesizing expressive Gu-Qin notes, designed to cover as many combinations of the Gu-Qin's performance techniques as possible. In this paper, a brief discussion of Gu-Qin playing and its special tablature notation is given to clarify the relationship between its performance techniques and its sounds. This work includes a Gu-Qin musical notation system and a source-filter-model-based synthesizer. In addition, we implement an iOS app to demonstrate the model's low computational complexity and robustness; its friendly user interface makes improvisation easy.
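The source-filter principle behind such a synthesizer can be sketched in a few lines: a short excitation (the pluck) is passed through a resonant filter (the string/body response). The noise-burst source and single two-pole resonator below are deliberately crude placeholders; the paper's parametric model is far richer and technique-dependent, and all numeric values here are hypothetical.

```python
import numpy as np
from scipy.signal import lfilter

sr = 16000
n = sr // 2
rng = np.random.default_rng(4)

# Source: a short windowed noise burst modelling the pluck excitation.
source = np.zeros(n)
source[:200] = rng.standard_normal(200) * np.hanning(200)

# Filter: a single two-pole resonator tuned near a low Gu-Qin-like
# fundamental (hypothetical frequency and bandwidth).
f0, bw = 110.0, 8.0
r = np.exp(-np.pi * bw / sr)              # pole radius from bandwidth
theta = 2 * np.pi * f0 / sr               # pole angle from frequency
a = [1.0, -2 * r * np.cos(theta), r * r]  # all-pole denominator
note = lfilter([1.0], a, source)          # filtered excitation = the note
```

The appeal of this structure for a mobile app is visible even in the sketch: per-sample cost is a fixed, small number of multiply-adds, which is what makes a low-complexity iOS implementation feasible.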
Extended Source-Filter Model for Harmonic Instruments for Expressive Control of Sound Synthesis and Transformation
In this paper we present a revised and improved version of a recently proposed extended source-filter model for sound synthesis, transformation and hybridization of harmonic instruments. This extension focuses mainly on impulsively excited instruments like the piano or guitar, but also improves synthesis results for continuously driven instruments, including their hybrids. The technique comprises an extensive analysis of an instrument's sound database, followed by the estimation of a generalized instrument model reflecting timbre variations according to selected control parameters. Such an instrument model allows for natural-sounding transformations and expressive control of instrument sounds with respect to those control parameters.