Aliasing Reduction in Neural Amp Modeling by Smoothing Activations
The increasing demand for high-quality digital emulations of analog audio hardware, such as vintage tube guitar amplifiers, led to numerous works on neural network-based black-box modeling, with deep learning architectures like WaveNet showing promising results. However, a key limitation in all of these models was the aliasing artifacts stemming from nonlinear activation functions in neural networks. In this paper, we investigated novel and modified activation functions aimed at mitigating aliasing within neural amplifier models. Supporting this, we introduced a novel metric, the Aliasing-to-Signal Ratio (ASR), which quantitatively assesses the level of aliasing with high accuracy. Measuring also the conventional Error-to-Signal Ratio (ESR), we conducted studies on a range of preexisting and modern activation functions with varying stretch factors. Our findings confirmed that activation functions with smoother curves tend to achieve lower ASR values, indicating a noticeable reduction in aliasing. Notably, this improvement in aliasing reduction was achievable without a substantial increase in ESR, demonstrating the potential for high modeling accuracy with reduced aliasing in neural amp models.
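The two quantities discussed above can be illustrated with a short sketch. The ESR function follows the usual definition (error energy divided by target energy). The second function is only an assumed stand-in for an aliasing measure, not the paper's ASR: it drives a nonlinearity with a sine that sits exactly on a DFT bin and compares the energy at non-harmonic bins with the energy at harmonic bins. The names `esr`, `alias_energy_ratio`, and `fundamental_bin` are hypothetical.

```python
import cmath
import math

def esr(target, prediction):
    """Error-to-Signal Ratio: error energy divided by target energy."""
    error_energy = sum((t - p) ** 2 for t, p in zip(target, prediction))
    signal_energy = sum(t ** 2 for t in target)
    return error_energy / signal_energy

def alias_energy_ratio(signal, fundamental_bin):
    """Illustrative aliasing measure (an assumption, not the paper's ASR).

    The input is expected to be one analysis frame whose fundamental lies
    exactly on DFT bin `fundamental_bin`. Energy at bins that are integer
    multiples of that bin is treated as wanted harmonic content; energy at
    every other bin is treated as aliasing. The DC bin is skipped.
    """
    n = len(signal)
    harmonic = 0.0
    aliased = 0.0
    for k in range(1, n // 2 + 1):
        # Naive DFT of bin k; adequate for a short frame.
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(signal))
        if k % fundamental_bin == 0:
            harmonic += abs(acc) ** 2
        else:
            aliased += abs(acc) ** 2
    return aliased / harmonic

# Compare a gently and a heavily driven tanh on the same test tone.
frame = [math.sin(2 * math.pi * 37 * i / 256) for i in range(256)]
soft = alias_energy_ratio([math.tanh(0.5 * s) for s in frame], 37)
hard = alias_energy_ratio([math.tanh(10.0 * s) for s in frame], 37)
print(soft, hard)
```

Choosing a fundamental bin that shares no common factor with the frame length keeps the folded harmonics away from the true harmonic bins, which is what makes the separation in this sketch meaningful.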
Antialiased Black-Box Modeling of Audio Distortion Circuits Using Real Linear Recurrent Units
In this paper, we propose the use of real-valued Linear Recurrent Units (LRUs) for black-box modeling of audio circuits. A network architecture composed of real LRU blocks interleaved with nonlinear processing stages is proposed. Two case studies are presented, a second-order diode clipper and an overdrive distortion pedal. Furthermore, we show how to integrate the antiderivative antialiasing technique into the proposed method, effectively lowering oversampling requirements. Our experiments show that the proposed method generates models that accurately capture the nonlinear dynamics of the examined devices and are highly efficient, which makes them suitable for real-time operation inside Digital Audio Workstations.
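As a rough illustration of antiderivative antialiasing, the sketch below applies the first-order form to a memoryless tanh stage. The choice of tanh and the function names are assumptions; the abstract does not state which nonlinearities the proposed network uses or how the technique is wired into the LRU blocks.

```python
import math

def log_cosh(x):
    """Antiderivative of tanh, written to stay finite for large |x|."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)

def tanh_adaa(samples, eps=1e-6):
    """First-order antiderivative antialiasing of tanh.

    Each output is the average of tanh over the segment between the
    previous and the current input sample. When the two inputs almost
    coincide, the difference quotient is ill-conditioned, so tanh is
    evaluated at the midpoint instead.
    """
    out = []
    prev_x = 0.0
    prev_f = log_cosh(prev_x)
    for x in samples:
        f = log_cosh(x)
        diff = x - prev_x
        if abs(diff) < eps:
            y = math.tanh(0.5 * (x + prev_x))
        else:
            y = (f - prev_f) / diff
        out.append(y)
        prev_x, prev_f = x, f
    return out

# Drive the stage hard with a sine to exercise the saturating region.
tone = [4.0 * math.sin(2 * math.pi * 0.11 * n) for n in range(64)]
shaped = tanh_adaa(tone)
```

The first-order form behaves like the plain nonlinearity followed by a two-tap average, so it adds half a sample of delay and a mild high-frequency roll-off.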
Training Neural Models of Nonlinear Multi-Port Elements Within Wave Digital Structures Through Discrete-Time Simulation
Neural networks have been applied within the Wave Digital Filter (WDF) framework as data-driven models for nonlinear multi-port circuit elements. Conventionally, these models are trained on wave variables obtained by sampling the current-voltage characteristic of the considered nonlinear element before being incorporated into the circuit WDF implementation. However, isolating multi-port elements for this process can be challenging, as their nonlinear behavior often depends on dynamic effects that emerge from interactions with the surrounding circuit. In this paper, we propose a novel approach for training neural models of nonlinear multi-port elements directly within a circuit’s Wave Digital (WD) discrete-time implementation, relying solely on circuit input-output voltage measurements. Exploiting the differentiability of WD simulations, we embed the neural network into the simulation process and optimize its parameters using gradient-based methods by minimizing a loss function defined over the circuit output voltage. Experimental results demonstrate the effectiveness of the proposed approach in accurately capturing the nonlinear circuit behavior, while preserving the interpretability and modularity of WDFs.
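The conventional data preparation mentioned above (sampling a current-voltage characteristic and converting it to wave variables) might look like the following sketch. It uses a one-port diode with the Shockley equation purely for illustration; the element, its parameter values, and the port resistance `z` are assumptions, and the multi-port case treated in the paper is not shown.

```python
import math

def diode_current(v, i_s=1e-12, n=1.8, v_t=0.02585):
    """Shockley diode law; the parameter values are illustrative only."""
    return i_s * (math.exp(v / (n * v_t)) - 1.0)

def to_waves(v, i, z):
    """Voltage-wave definition: incident a = v + z*i, reflected b = v - z*i."""
    return v + z * i, v - z * i

def from_waves(a, b, z):
    """Inverse mapping from wave variables back to voltage and current."""
    return 0.5 * (a + b), (a - b) / (2.0 * z)

# Sample the characteristic on a voltage grid and build (a, b) pairs,
# which would serve as input/target pairs for a neural model of the port.
z = 1000.0  # free port resistance in ohms (hypothetical value)
grid = [-0.5 + 0.01 * k for k in range(121)]
pairs = [to_waves(v, diode_current(v), z) for v in grid]
```

The approach proposed in the abstract avoids this isolated sampling step and instead learns the element from input-output voltages of the whole circuit.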
Distributed Single-Reed Modeling Based on Energy Quadratization and Approximate Modal Expansion
Recently, energy quadratization and modal expansion have become popular methods for developing efficient physics-based sound synthesis algorithms. These methods have been primarily used to derive explicit schemes modeling the collision between a string and a fixed barrier. In this paper, these techniques are applied to a similar problem: modeling a distributed mouthpiece lay-reed-lip interaction in a woodwind instrument. The proposed model aims to provide a more accurate representation of how a musician’s embouchure affects the reed’s dynamics. The mouthpiece and lip are modeled as distributed static and dynamic viscoelastic barriers, respectively. The reed is modeled using an approximate modal expansion derived via the Rayleigh-Ritz method. The reed system is then acoustically coupled to a measured input impedance response of a saxophone. Numerical experiments are presented.
A Wavelet-Based Method for the Estimation of Clarity of Attack Parameters in Non-Percussive Instruments
From the exploration of databases of instrument sounds to the self-assisted practice of musical instruments, methods for automatically and objectively assessing the quality of musical tones are in high demand. In this paper, we develop a new algorithm for estimating the duration of the attack, with particular attention to wind and bowed string instruments. For these instruments, the quality of the tones is highly influenced by the attack clarity, for which the attack duration, together with pitch stability, is an indicator that teachers often judge by ear. Since the direct estimation of the attack duration from sounds is made difficult by the initial preponderance of the excitation noise, we propose a more robust approach based on the separation of the ensemble of the harmonics from the excitation noise, which is obtained by means of an improved pitch-synchronous wavelet transform. We also define a new parameter, the noise ducking time, which is relevant for detecting the extent of the noise component in the attack. In addition to the exploration of available sound databases, for testing our algorithm, we created an annotated data set in which several problematic sounds are included. Moreover, to check the consistency and robustness of our duration estimates, we applied our algorithm to sets of synthetic sounds with noisy attacks of programmable duration.
Non-Iterative Numerical Simulation in Virtual Analog: A Framework Incorporating Current Trends
Owing to their low and constant computational cost, non-iterative methods for the solution of differential problems are gaining popularity in virtual analog, provided that their stability properties and accuracy level allow their use without excessive temporal oversampling. At least in some application case studies, one recent family of non-iterative schemes has shown promise to outperform methods that achieve accurate results at the cost of iterating several times while converging to the numerical solution. Here, this family is contextualized and studied against known classes of non-iterative methods. The results from these studies foster a more general discussion about the possibilities, role and prospective use of non-iterative methods in virtual analog.
Power-Balanced Drift Regulation for Scalar Auxiliary Variable Methods: Application to Real-Time Simulation of Nonlinear String Vibrations
Efficient stable integration methods for nonlinear systems are of great importance for physical modeling sound synthesis. Specifically, a number of musical systems of interest, including vibrating strings, bars or plates, may be written as port-Hamiltonian systems with quadratic kinetic energy and non-quadratic potential energy. Efficient schemes have been developed for such systems through the introduction of a scalar auxiliary variable. As a result, stable real-time simulation of nonlinear musical systems of up to a few thousand degrees of freedom is possible, even for nearly lossless systems. However, convergence rates can be slow and seem to be system-dependent. Specifically, at audio rates, these schemes may suffer from numerical drift of the auxiliary variable, resulting in dramatic unwanted effects on audio output, such as pitch drifts after several impacts on the same resonator. In this paper, a novel method for mitigating this unwanted drift while preserving power balance is presented, based on a control approach. A set of modified equations is proposed to control the drift artefact by rerouting energy through the scalar auxiliary variable and potential energy state. Numerical experiments are run in order to check convergence on simulations in the case of a cubic nonlinear string. A real-time implementation is provided as a Max/MSP external. 60-note polyphony is achieved on a laptop, and some simple high level control parameters are provided, making the proposed implementation suitable for use in artistic contexts. All code is available in a public repository, along with compiled Max/MSP externals.
Fast Differentiable Modal Simulation of Non-Linear Strings, Membranes, and Plates
Modal methods for simulating vibrations of strings, membranes, and plates are widely used in acoustics and physically informed audio synthesis. However, traditional implementations, particularly for non-linear models like the von Kármán plate, are computationally demanding and lack differentiability, limiting inverse modelling and real-time applications. We introduce a fast, differentiable, GPU-accelerated modal framework built with the JAX library, providing efficient simulations and enabling gradient-based inverse modelling. Benchmarks show that our approach significantly outperforms CPU and GPU-based implementations, particularly for simulations with many modes. Inverse modelling experiments demonstrate that our approach can recover physical parameters, including tension, stiffness, and geometry, from both synthetic and experimental data. Although fitting physical parameters is more sensitive to initialisation compared to methods that fit abstract spectral parameters, it provides greater interpretability and more compact parameterisation. The code is released as open source to support future research and applications in differentiable physical modelling and sound synthesis.
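To make the link between physical parameters and modal content concrete, the sketch below computes the modal frequencies of a textbook linear stiff string with simply supported ends. This model, the function name, and the parameter values are assumptions chosen for illustration; the abstract does not specify the framework's equations, and the non-linear and JAX-based parts are not shown.

```python
import math

def stiff_string_mode_frequencies(n_modes, length, tension, rho_a, ei):
    """Modal frequencies in Hz of a linear stiff string, simply supported.

    For the assumed model rho_a * u_tt = tension * u_xx - ei * u_xxxx, the
    mode shapes are sin(m*pi*x/length) and the angular frequencies satisfy
    omega_m^2 = (tension/rho_a) * beta_m^2 + (ei/rho_a) * beta_m^4,
    with beta_m = m*pi/length.
    """
    freqs = []
    for m in range(1, n_modes + 1):
        beta = m * math.pi / length
        omega_sq = (tension / rho_a) * beta ** 2 + (ei / rho_a) * beta ** 4
        freqs.append(math.sqrt(omega_sq) / (2.0 * math.pi))
    return freqs

# Hypothetical parameters: 0.65 m string, 60 N tension, 5 g/m linear density.
ideal = stiff_string_mode_frequencies(5, 0.65, 60.0, 0.005, 0.0)
stiff = stiff_string_mode_frequencies(5, 0.65, 60.0, 0.005, 0.01)
print(ideal)
print(stiff)
```

With zero bending stiffness the frequencies form an exact harmonic series, while a positive stiffness stretches the upper partials. Parameters such as tension and stiffness therefore leave a distinct signature in the spectrum, which is what makes recovering them by inverse modelling plausible.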
Learning Nonlinear Dynamics in Physical Modelling Synthesis Using Neural Ordinary Differential Equations
Modal synthesis methods are a long-standing approach for modelling distributed musical systems. In some cases extensions are possible in order to handle geometric nonlinearities. One such case is the high-amplitude vibration of a string, where geometric nonlinear effects lead to perceptually important effects including pitch glides and a dependence of brightness on striking amplitude. A modal decomposition leads to a coupled nonlinear system of ordinary differential equations. Recent applied machine learning approaches (in particular neural ordinary differential equations) have been used to model lumped dynamic systems such as electronic circuits automatically from data. In this work, we examine how modal decomposition can be combined with neural ordinary differential equations for modelling distributed musical systems. The proposed model leverages the analytical solution for linear vibration of the system’s modes and employs a neural network to account for nonlinear dynamic behaviour. Physical parameters of a system remain easily accessible after the training without the need for a parameter encoder in the network architecture. As an initial proof of concept, we generate synthetic data for a nonlinear transverse string and show that the model can be trained to reproduce the nonlinear dynamics of the system. Sound examples are presented.
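The linear part that such a model can rely on is available in closed form. The sketch below evaluates the response of a single underdamped mode governed by q'' + 2*sigma*q' + omega0^2*q = 0; this damping model and the function name are assumptions, and the neural network that would account for the nonlinear coupling is not shown.

```python
import math

def damped_mode(t, omega0, sigma, q0, v0):
    """Closed-form displacement of one underdamped linear mode at time t.

    Solves q'' + 2*sigma*q' + omega0**2 * q = 0 with q(0) = q0 and
    q'(0) = v0, assuming sigma < omega0 (underdamped).
    """
    omega_d = math.sqrt(omega0 ** 2 - sigma ** 2)  # damped angular frequency
    envelope = math.exp(-sigma * t)
    return envelope * (q0 * math.cos(omega_d * t)
                       + (v0 + sigma * q0) / omega_d * math.sin(omega_d * t))

# Sample one 220 Hz mode at 44.1 kHz for 10 ms (values are illustrative).
sample_rate = 44100.0
omega0 = 2.0 * math.pi * 220.0
mode = [damped_mode(n / sample_rate, omega0, 3.0, 1e-3, 0.0)
        for n in range(441)]
```

Because the expression is exact for any time step, the linear modes need no numerical integration, and parameters such as `omega0` and `sigma` stay directly readable.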
Physics-Informed Deep Learning for Nonlinear Friction Model of Bow-String Interaction
This study investigates the use of an unsupervised, physics-informed deep learning framework to model a one-degree-of-freedom mass-spring system subjected to a nonlinear friction bow force and governed by a set of ordinary differential equations. Specifically, it examines the application of Physics-Informed Neural Networks (PINNs) and Physics-Informed Deep Operator Networks (PI-DeepONets). Our findings demonstrate that PINNs successfully address the problem across different bow force scenarios, while PI-DeepONets perform well under low bow forces but encounter difficulties at higher forces. Additionally, we analyze the Hessian eigenvalue density and visualize the loss landscape. Overall, the presence of large Hessian eigenvalues and sharp minima indicates highly ill-conditioned optimization. These results underscore the promise of physics-informed deep learning for nonlinear modelling in musical acoustics, while also revealing the limitations of relying solely on physics-based approaches to capture complex nonlinearities. We demonstrate that PI-DeepONets, with their ability to generalize across varying parameters, are well-suited for sound synthesis. Furthermore, we demonstrate that the limitations of PI-DeepONets under higher forces can be mitigated by integrating observation data within a hybrid supervised-unsupervised framework. This suggests that a hybrid supervised-unsupervised DeepONets framework could be a promising direction for future practical applications.