Differentiable grey-box modelling of phaser effects using frame-based spectral processing
Machine learning approaches to modelling analog audio effects have seen intensive investigation in recent years, particularly in the context of non-linear time-invariant effects such as guitar amplifiers. For modulation effects such as phasers, however, new challenges emerge due to the presence of the low-frequency oscillator which controls the slowly time-varying nature of the effect. Existing approaches have either required foreknowledge of this control signal, or have been non-causal in implementation. This work presents a differentiable digital signal processing approach to modelling phaser effects in which the underlying control signal and time-varying spectral response of the effect are jointly learned. The proposed model processes audio in short frames to implement a time-varying filter in the frequency domain, with a transfer function based on typical analog phaser circuit topology. We show that the model can be trained to emulate an analog reference device, while retaining interpretable and adjustable parameters. The frame duration is an important hyper-parameter of the proposed model, so an investigation was carried out into its effect on model accuracy. The optimal frame length depends on both the rate and transient decay-time of the target effect, but the frame length can be altered at inference time without a significant change in accuracy.
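As a rough illustration of the frame-based idea in the abstract above, the sketch below evaluates a phaser-like transfer function once per frame, with the filter coefficient held constant within each frame and moved by a low-frequency oscillator between frames. Everything specific here is an assumption made for illustration and is not taken from the paper: four identical first-order all-pass stages, a sinusoidal LFO, an equal dry/wet mix, and the function names themselves.

```python
import cmath
import math

def allpass_response(a, omega):
    """Frequency response of a first-order all-pass (a + z^-1) / (1 + a z^-1)."""
    z_inv = cmath.exp(-1j * omega)
    return (a + z_inv) / (1 + a * z_inv)

def phaser_response(a, omega, stages=4, mix=0.5):
    """Dry path mixed with a cascade of identical all-pass stages (assumed topology)."""
    wet = allpass_response(a, omega) ** stages
    return (1 - mix) + mix * wet

def lfo_coefficient(frame_index, frame_len, sample_rate, lfo_hz,
                    centre=-0.5, depth=0.3):
    """All-pass coefficient for one frame, driven by a sine LFO (assumed shape)."""
    t = frame_index * frame_len / sample_rate  # start time of the frame in seconds
    return centre + depth * math.sin(2 * math.pi * lfo_hz * t)

def frame_responses(num_frames, frame_len, sample_rate, lfo_hz, num_bins=257):
    """Magnitude response per frame on a uniform grid from 0 to Nyquist."""
    out = []
    for k in range(num_frames):
        a = lfo_coefficient(k, frame_len, sample_rate, lfo_hz)
        out.append([abs(phaser_response(a, math.pi * b / (num_bins - 1)))
                    for b in range(num_bins)])
    return out
```

In a frame-based processor, each row of `frame_responses` would multiply the spectrum of the corresponding audio frame. The notches appear where the all-pass cascade is in anti-phase with the dry path.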
A Coupled Resonant Filter Bank for the Sound Synthesis of Nonlinear Sources
This paper is concerned with the design of efficient and controllable filters for sound synthesis purposes, in the context of the generation of sounds radiated by nonlinear sources. These filters are coupled and generate tonal components in an interdependent way, and are intended to emulate realistic perceptually salient effects in musical instruments in an efficient manner. Control of energy transfer between the filters is realized by defining a matrix containing the coupling terms. The generation of prototypical sounds corresponding to nonlinear sources with the filter bank is presented. In particular, examples are proposed to generate sounds corresponding to impacts on thin structures and to the perturbation of the vibration of an object when it collides with another object. The sound examples presented in the paper, which are available for listening on the accompanying site, suggest that simple control of the input parameters makes it possible to generate sounds whose evocation is coherent, and that the addition of random processes significantly improves the realism of the generated sounds.
Sample Rate Independent Recurrent Neural Networks for Audio Effects Processing
In recent years, machine learning approaches to modelling guitar amplifiers and effects pedals have been widely investigated and have become standard practice in some consumer products. In particular, recurrent neural networks (RNNs) are a popular choice for modelling non-linear devices such as vacuum tube amplifiers and distortion circuitry. One limitation of such models is that they are trained on audio at a specific sample rate and therefore give unreliable results when operating at another rate. Here, we investigate several methods of modifying RNN structures to make them approximately sample rate independent, with a focus on oversampling. In the case of integer oversampling, we demonstrate that a previously proposed delay-based approach provides high fidelity sample rate conversion whilst additionally reducing aliasing. For non-integer sample rate adjustment, we propose two novel methods and show that one of these, based on cubic Lagrange interpolation of a delay-line, provides a significant improvement over existing methods. To our knowledge, this work provides the first in-depth study into this problem.
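To make the phrase "cubic Lagrange interpolation of a delay-line" more concrete, here is a minimal sketch of reading a non-integer delay from a buffer of past values using four-point Lagrange weights. The buffer layout, the function names, and the choice to centre the interpolation window on the target delay are assumptions for illustration. How the paper applies the interpolated value inside the RNN is not shown here.

```python
def lagrange_coefficients(delay, order=3):
    """Lagrange FIR taps h[0..order] approximating a delay of `delay` samples."""
    coeffs = []
    for k in range(order + 1):
        h = 1.0
        for m in range(order + 1):
            if m != k:
                h *= (delay - m) / (k - m)
        coeffs.append(h)
    return coeffs

def read_delay_line(history, delay):
    """Approximate x[n - delay], where history[i] holds x[n - i].

    Assumes delay >= 0 and len(history) >= int(delay) + 3. The four-point
    window is shifted so the target lies between its two middle taps
    whenever delay >= 1, which is where cubic Lagrange is most accurate.
    """
    offset = max(int(delay) - 1, 0)
    taps = lagrange_coefficients(delay - offset)
    return sum(h * history[offset + k] for k, h in enumerate(taps))
```

For an integer delay the taps collapse to a unit impulse, so the read reduces to a plain buffer lookup. For fractional delays the weights still sum to one.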
Digitizing the Schumann PLL Analog Harmonizer
The Schumann Electronics PLL is a guitar effect that uses hardware-based processing of one-bit digital signals, with op-amp saturation and CMOS control systems used to generate multiple square waves derived from the frequency of the input signal. The effect may be simulated in the digital domain by cascading stages of state-space virtual analog modeling and algorithmic approximations of CMOS integrated circuits. Phase-locked loops, decade counters, and Schmitt trigger inverters are modeled using logic algorithms, allowing for a comparable digital implementation of the Schumann PLL. Simulation results are presented.
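The kind of "logic algorithm" mentioned above can be sketched in a few lines. The example below is a generic illustration, not the paper's model: a non-inverting Schmitt trigger with hysteresis turns a signal into one-bit logic, and an edge counter derives a slower square wave from it. The thresholds, the non-inverting polarity, and all function names are assumptions.

```python
import math

def schmitt_trigger(samples, low=-0.1, high=0.1):
    """One-bit output with hysteresis: rises above `high`, falls below `low`."""
    state = 0
    bits = []
    for x in samples:
        if state == 0 and x > high:
            state = 1
        elif state == 1 and x < low:
            state = 0
        bits.append(state)
    return bits

def divide_by_n(bits, n):
    """Square wave whose period spans n rising edges of the input bit stream."""
    count = 0
    prev = 0
    out = []
    for b in bits:
        if prev == 0 and b == 1:  # rising edge advances the counter
            count = (count + 1) % n
        prev = b
        out.append(1 if 2 * count < n else 0)  # high for the first half of the cycle
    return out

def count_rising_edges(bits):
    """Number of 0 -> 1 transitions between consecutive samples."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a == 0 and b == 1)
```

Feeding a sine wave through `schmitt_trigger` and then `divide_by_n(bits, 2)` gives a square wave one octave below the input. The hysteresis band keeps small fluctuations around zero from producing spurious edges.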
Real-Time Guitar Synthesis
The synthesis of guitar tones was one of the first uses of physical modeling synthesis, and many approaches (notably digital waveguides) have been employed. The dynamics of the string under playing conditions is complex, and includes nonlinearities, both inherent to the string itself, and due to various collisions with the fretboard, frets and a stopping finger. All lead to important perceptual effects, including pitch glides, rattling against frets, and the ability to play on the harmonics. Numerical simulation of these simultaneous strong nonlinearities is challenging, but recent advances in algorithm design due to invariant energy quadratisation and scalar auxiliary variable methods allow for very efficient and provably numerically stable simulation. A new design is presented here that does not employ costly iterative methods such as the Newton-Raphson method, and for which required linear system solutions are small. As such, this method is suitable for real-time implementation. Simulation and timing results are presented.
Differentiable All-Pole Filters for Time-Varying Audio Systems
Infinite impulse response filters are an essential building block of many time-varying audio systems, such as audio effects and synthesisers. However, their recursive structure impedes end-to-end training of these systems using automatic differentiation. Although non-recursive filter approximations like frequency sampling and frame-based processing have been proposed and widely used in previous works, they cannot accurately reflect the gradient of the original system. We alleviate this difficulty by re-expressing a time-varying all-pole filter to backpropagate the gradients through itself, so the filter implementation is not bound to the technical limitations of automatic differentiation frameworks. This implementation can be employed within audio systems containing filters with poles for efficient gradient evaluation. We demonstrate its training efficiency and expressive capabilities for modelling real-world dynamic audio systems on a phaser, time-varying subtractive synthesiser, and feed-forward compressor. We make our code and audio samples available and provide the trained audio effect and synth models in a VST plugin.
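For readers unfamiliar with the recursion in question, the following sketch shows only the forward pass of a time-varying all-pole filter, written sample by sample. It does not attempt the paper's gradient formulation. The coefficient layout (one list of denominator coefficients per sample), the zero initial state, and the function name are assumptions.

```python
def time_varying_all_pole(x, a):
    """Forward recursion y[n] = x[n] - sum_k a[n][k-1] * y[n-k], k = 1..M.

    x: input samples.
    a: one list of M denominator coefficients per sample (assumed layout).
    Samples before the start of the signal are treated as zero.
    """
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, a_nk in enumerate(a[n], start=1):
            if n - k >= 0:
                acc -= a_nk * y[n - k]
        y.append(acc)
    return y
```

Each output sample depends on earlier outputs, so the loop cannot be vectorised over time in the obvious way. This is the recursive structure that the abstract says complicates training with automatic differentiation.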
Interpolation Filters for Antiderivative Antialiasing
Aliasing is an inherent problem in nonlinear digital audio processing which results in undesirable audible artefacts. Antiderivative antialiasing has proved to be an effective approach to mitigate aliasing distortion, and is based on continuous-time convolution of a linearly interpolated distorted signal with antialiasing filter kernels. However, the performance of this method is determined by the properties of the interpolation filter. In this work, cubic interpolation kernels for antiderivative antialiasing are considered. For memoryless nonlinearities, aliasing reduction is improved by employing cubic interpolation. For stateful systems, numerical simulation and stability analysis with respect to different interpolation kernels remain in favour of linear interpolation.
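As background for the abstract above, this is a minimal sketch of the well-known first-order form of antiderivative antialiasing, which corresponds to the linear-interpolation baseline and not to the cubic kernels studied in the paper. The choice of `tanh` as the memoryless nonlinearity, the tolerance value, and the function names are assumptions.

```python
import math

def tanh_antiderivative(x):
    """F(x) = log(cosh(x)), arranged to avoid overflow for large |x|."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)

def adaa_tanh(samples, tol=1e-6):
    """First-order antiderivative antialiasing applied to tanh.

    Each output is the difference quotient of F between consecutive inputs.
    When the inputs are nearly equal the quotient is ill-conditioned, so the
    nonlinearity is evaluated at their midpoint instead.
    """
    out = []
    x_prev = 0.0
    f_prev = tanh_antiderivative(x_prev)
    for x in samples:
        f = tanh_antiderivative(x)
        dx = x - x_prev
        if abs(dx) < tol:
            y = math.tanh(0.5 * (x + x_prev))
        else:
            y = (f - f_prev) / dx
        out.append(y)
        x_prev, f_prev = x, f
    return out
```

The difference quotient equals the average of `tanh` over the straight line between two consecutive input samples, which acts as a mild low-pass on the distorted signal before it is resampled.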
Anti-Aliasing of Neural Distortion Effects via Model Fine Tuning
Neural networks have become ubiquitous in guitar distortion effects modelling in recent years. Despite their ability to yield perceptually convincing models, they are susceptible to frequency aliasing when driven by high frequency and high gain inputs. Nonlinear activation functions create both the desired harmonic distortion and unwanted aliasing distortion as the bandwidth of the signal is expanded beyond the Nyquist frequency. Here, we present a method for reducing aliasing in neural models via a teacher-student fine tuning approach, where the teacher is a pre-trained model with its weights frozen, and the student is a copy of this with learnable parameters. The student is fine-tuned against an aliasing-free dataset generated by passing sinusoids through the original model and removing non-harmonic components from the output spectra. Our results show that this method significantly suppresses aliasing for both long short-term memory (LSTM) networks and temporal convolutional networks (TCN). In the majority of our case studies, the reduction in aliasing was greater than that achieved by two times oversampling. One side-effect of the proposed method is that harmonic distortion components are also affected. This adverse effect was found to be model-dependent, with the LSTM models giving the best balance between anti-aliasing and preserving the perceived similarity to an analog reference device.
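The dataset construction described above, removing non-harmonic components from the output spectrum of a driven sinusoid, can be sketched as follows. This is an illustration under simplifying assumptions and not the paper's procedure: the sinusoid completes a whole number of cycles in the analysis window, so harmonics fall exactly on DFT bins; a plain `tanh` stands in for the neural model; and the function names are hypothetical. Aliased components that happen to land exactly on a harmonic bin would not be removed by this approach.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (adequate for short windows)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def harmonic_only(signal, fundamental_bin):
    """Zero every DFT bin that is not a multiple of the fundamental bin."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        freq_bin = min(k, n - k)  # bins k and n - k share one real frequency
        if freq_bin % fundamental_bin != 0:
            spectrum[k] = 0
    return [v.real for v in idft(spectrum)]

def distorted_sine(n, fundamental_bin, gain):
    """Stand-in for the model output: tanh applied to a sinusoid."""
    return [math.tanh(gain * math.sin(2 * math.pi * fundamental_bin * t / n))
            for t in range(n)]
```

With a 64-sample window and the fundamental on bin 5, the seventh harmonic would sit at bin 35, above Nyquist, and folds back to bin 29. Since 29 is not a multiple of 5, `harmonic_only` removes it while leaving the fundamental and the in-band harmonics untouched.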
Power-Balanced Drift Regulation for Scalar Auxiliary Variable Methods: Application to Real-Time Simulation of Nonlinear String Vibrations
Efficient stable integration methods for nonlinear systems are of great importance for physical modeling sound synthesis. Specifically, a number of musical systems of interest, including vibrating strings, bars or plates, may be written as port-Hamiltonian systems with quadratic kinetic energy and non-quadratic potential energy. Efficient schemes have been developed for such systems through the introduction of a scalar auxiliary variable. As a result, stable real-time simulation of nonlinear musical systems of up to a few thousand degrees of freedom is possible, even for nearly lossless systems. However, convergence rates can be slow and seem to be system-dependent. Specifically, at audio rates, these schemes may suffer from numerical drift of the auxiliary variable, resulting in dramatic unwanted effects on audio output, such as pitch drifts after several impacts on the same resonator. In this paper, a novel method for mitigating this unwanted drift while preserving power balance is presented, based on a control approach. A set of modified equations is proposed to control the drift artefact by rerouting energy through the scalar auxiliary variable and potential energy state. Numerical experiments are run in order to check convergence on simulations in the case of a cubic nonlinear string. A real-time implementation is provided as a Max/MSP external. 60-note polyphony is achieved on a laptop, and some simple high-level control parameters are provided, making the proposed implementation suitable for use in artistic contexts. All code is available in a public repository, along with compiled Max/MSP externals.
Learning Nonlinear Dynamics in Physical Modelling Synthesis Using Neural Ordinary Differential Equations
Modal synthesis methods are a long-standing approach for modelling distributed musical systems. In some cases extensions are possible in order to handle geometric nonlinearities. One such case is the high-amplitude vibration of a string, where geometric nonlinearities lead to perceptually important effects including pitch glides and a dependence of brightness on striking amplitude. A modal decomposition leads to a coupled nonlinear system of ordinary differential equations. In recent work, applied machine learning approaches (in particular neural ordinary differential equations) have been used to model lumped dynamic systems such as electronic circuits automatically from data. In this work, we examine how modal decomposition can be combined with neural ordinary differential equations for modelling distributed musical systems. The proposed model leverages the analytical solution for the linear vibration of the system's modes and employs a neural network to account for nonlinear dynamic behaviour. Physical parameters of a system remain easily accessible after training, without the need for a parameter encoder in the network architecture. As an initial proof of concept, we generate synthetic data for a nonlinear transverse string and show that the model can be trained to reproduce the nonlinear dynamics of the system. Sound examples are presented.