FDNTB: The Feedback Delay Network Toolbox

Feedback delay networks (FDNs) are recursive filters, which are
widely used for artificial reverberation and decorrelation. While
there exists a vast literature on a wide variety of reverb topologies,
this work aims to provide a unifying framework to design and analyze delay-based reverberators. To this end, we present the Feedback Delay Network Toolbox (FDNTB), a collection of MATLAB functions and example scripts. The FDNTB includes various representations of FDNs and corresponding translation functions. Further, it provides a selection of special feedback matrices,
topologies, and attenuation filters. In particular, more advanced
algorithms such as modal decomposition, time-varying matrices,
and filter feedback matrices are readily accessible. Furthermore,
our toolbox contains several additional FDN designs. Providing
MATLAB code under a GNU-GPL 3.0 license and including illustrative examples, we aim to foster research and education in the
field of audio processing.
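The FDN structure the toolbox implements can be sketched in a few lines. The following is a minimal Python illustration of a four-delay-line FDN with an orthogonal (Hadamard) feedback matrix; delay lengths, gains, and function names are illustrative choices, not the FDNTB's MATLAB API.

```python
import numpy as np

def fdn_impulse_response(n=4000):
    """Impulse response of a toy 4-delay-line FDN (illustrative parameters)."""
    delays = np.array([149, 211, 263, 293])        # mutually prime lengths
    H = np.array([[1,  1,  1,  1],
                  [1, -1,  1, -1],
                  [1,  1, -1, -1],
                  [1, -1, -1,  1]]) / 2.0          # orthogonal Hadamard mixing
    A = 0.97 * H                                   # lossy feedback matrix
    b = np.ones(4)                                 # input gains
    c = np.ones(4) / 4                             # output gains
    bufs = [np.zeros(d) for d in delays]           # circular delay-line buffers
    idx = np.zeros(4, dtype=int)
    x = np.zeros(n); x[0] = 1.0                    # unit impulse input
    h = np.zeros(n)
    for t in range(n):
        outs = np.array([bufs[i][idx[i]] for i in range(4)])  # delay-line outputs
        h[t] = c @ outs
        fb = A @ outs + b * x[t]                   # recirculate and inject input
        for i in range(4):
            bufs[i][idx[i]] = fb[i]
            idx[i] = (idx[i] + 1) % delays[i]
    return h
```

Because the feedback matrix is a scaled orthogonal matrix with gain below one, the response is stable and decays, which is the basic property the toolbox's analysis functions build on.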
Velvet-Noise Feedback Delay Network

Artificial reverberation is an audio effect used to simulate the acoustics of a space while controlling its aesthetics, particularly on sounds
recorded in a dry studio environment. Delay-based methods are
a family of artificial reverberators using recirculating delay lines
to create this effect.
The feedback delay network is a popular
delay-based reverberator providing a comprehensive framework
for parametric reverberation by formalizing the recirculation of
a set of interconnected delay lines. However, one known limitation of this algorithm is the initial slow build-up of echoes, which
can sound unrealistic, and overcoming this problem often requires
adding more delay lines to the network. In this paper, we study the
effect of adding velvet-noise filters, which have random sparse coefficients, at the input and output branches of the reverberator. The
goal is to increase the echo density while minimizing the spectral coloration. We compare different variations of velvet-noise
filtering and show their benefits. We demonstrate that with velvet
noise, the echo density of a conventional feedback delay network
can be exceeded using half the number of delay lines and saving
over 50% of computing operations in a practical configuration using low-order attenuation filters.
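A velvet-noise sequence of the kind used in these filters is sparse: one randomly signed, randomly placed unit impulse per grid period. A minimal sketch, with illustrative density and length values:

```python
import numpy as np

def velvet_noise(fs=44100, density=2000, dur=0.03, rng=None):
    """Sparse velvet-noise sequence: one +/-1 impulse per grid period."""
    rng = np.random.default_rng(0) if rng is None else rng
    Td = fs / density                                  # average spacing (samples)
    n_pulses = int(dur * density)
    # jitter each impulse uniformly within its own grid period
    k = (np.arange(n_pulses) * Td + rng.uniform(0, Td, n_pulses)).astype(int)
    s = np.zeros(int(dur * fs))
    k = k[k < len(s)]
    s[k] = rng.choice([-1.0, 1.0], size=len(k))        # random signs
    return s
```

The sparsity is what makes velvet-noise filtering cheap: convolution reduces to a handful of signed additions per output sample, which is how the echo density of the FDN can be raised at low cost.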
Fade-in Control for Feedback Delay Networks

In virtual acoustics, it is common to simulate the early part of a
Room Impulse Response using approaches from geometrical acoustics and the late part using Feedback Delay Networks (FDNs). In
order to transition from the early to the late part, it is useful to
slowly fade in the FDN response. We propose two methods to control the fade-in, one based on double decays and the other based
on modal beating. We use modal analysis to explain the two concepts for incorporating this fade-in behaviour entirely within the
IIR structure of a multiple input multiple output FDN. We present
design equations, which allow for placing the fade-in time at an
arbitrary point within its derived limit.
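The double-decay idea can be illustrated numerically: subtracting a fast exponential decay from a slow one yields an envelope that first rises and then decays, i.e. a fade-in. The decay rates below are arbitrary examples, not the paper's design values, and the peak time follows from elementary calculus rather than from the paper's design equations.

```python
import numpy as np

fs = 1000
t = np.arange(0, 2.0, 1 / fs)
a, b = 2.0, 20.0                       # slow and fast decay rates (1/s)
env = np.exp(-a * t) - np.exp(-b * t)  # double-decay fade-in envelope
t_peak = np.log(b / a) / (b - a)       # analytic peak (fade-in) time
```

Choosing the two decay rates places the envelope peak, which mirrors how the paper's design equations place the fade-in time within its derived limit.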
Delay Network Architectures for Room and Coupled Space Modeling

Feedback delay network reverberators have decay filters associated with each delay line to model the frequency-dependent reverberation time (T60) of a space. The decay filters are typically
designed such that all delay lines independently produce the same
T60 frequency response. However, in real rooms, there are multiple, concurrent T60 responses that depend on the geometry and
physical properties of the materials present in the rooms. In this
paper, we propose the Grouped Feedback Delay Network (GFDN),
where groups of delay lines share different target T60s. We use the
GFDN to simulate coupled rooms, where one room is significantly
larger than the other. We also simulate rooms with different materials, with unique decay filters associated with each delay line
group, designed to represent the T60 characteristics of a particular
material. The T60 filters are designed to emulate the materials’ absorption characteristics with minimal computation. We discuss the
design of the mixing matrix to control inter- and intra-group mixing, and show how the amount of mixing affects the behavior of the
room modes. Finally, we discuss the inclusion of air absorption
filters on each delay line and physically motivated room resizing
techniques with the GFDN.
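One simple way to build an orthogonal mixing matrix with controllable inter-group coupling is to combine per-group orthogonal matrices with a rotation between groups; the angle then sets how much energy crosses between the groups. This construction is an illustrative sketch, not necessarily the paper's exact design.

```python
import numpy as np

def random_orthogonal(n, rng):
    """Random orthogonal matrix via QR decomposition."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def grouped_mixing_matrix(n_per_group, theta, rng=None):
    """Orthogonal mixing matrix for two delay-line groups.

    theta = 0 gives purely block-diagonal (intra-group) mixing;
    larger theta increases inter-group coupling.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    A1 = random_orthogonal(n_per_group, rng)       # intra-group mixing, group 1
    A2 = random_orthogonal(n_per_group, rng)       # intra-group mixing, group 2
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    intra = np.block([[A1, np.zeros_like(A1)],
                      [np.zeros_like(A2), A2]])
    inter = np.kron(R, np.eye(n_per_group))        # couples the two groups
    return inter @ intra                           # product of orthogonals
```

Since a Kronecker product of orthogonal matrices and a product of orthogonal matrices are both orthogonal, the result is lossless for any theta, so coupling strength can be tuned without affecting stability.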
Energy-Preserving Time-Varying Schroeder Allpass Filters

In artificial reverb algorithms, gains are commonly varied over
time to break up temporal patterns, improving quality. We propose
a family of novel Schroeder-style allpass filters that are energy-preserving under arbitrary, continuous changes of their gains over
time. All of them are canonic in delays, and some are also canonic
in multiplies. This yields several structures that are novel even in
the time-invariant case. Special cases for cascading and nesting
these structures with a reduced number of multipliers are shown as
well. The proposed structures should be useful in artificial reverb
applications and other time-varying audio effects based on allpass
filters, especially where allpass filters are embedded in feedback
loops and stability may be an issue.
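For reference, the classic time-invariant Schroeder allpass has the difference equation y[n] = -g x[n] + x[n-M] + g y[n-M]. The sketch below implements this plain form; the paper's contribution is a family of variants of this structure that remain energy-preserving when g changes over time, which this textbook form does not.

```python
import numpy as np

def schroeder_allpass(x, M=113, g=0.7):
    """Classic Schroeder allpass: y[n] = -g*x[n] + x[n-M] + g*y[n-M]."""
    y = np.zeros_like(x)
    for n in range(len(x)):
        xd = x[n - M] if n >= M else 0.0   # delayed input
        yd = y[n - M] if n >= M else 0.0   # delayed output (feedback)
        y[n] = -g * x[n] + xd + g * yd
    return y
```

Because |H(e^jw)| = 1 for fixed g, the impulse-response energy equals one (Parseval), which is the energy-preservation property the paper extends to time-varying gains.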
Perceptual Evaluation of Mitigation Approaches of Impairments Due to Spatial Undersampling in Binaural Rendering of Spherical Microphone Array Data: Dry Acoustic Environments

Employing a finite number of discrete microphones, instead of a
continuous distribution according to theory, reduces the physical
accuracy of sound field representations captured by a spherical microphone array. A number of approaches have been proposed in the literature to mitigate the perceptual impairment that arises when such captured sound fields are reproduced binaurally. We recently presented a perceptual evaluation of a representative set of approaches in conjunction with reverberant acoustic environments. This paper presents a similar study
but with acoustically dry environments with reverberation times
of less than 0.25 s. We examined the Magnitude Least-Squares
algorithm, the Bandwidth Extraction Algorithm for Microphone
Arrays, Spherical Head Filters, spherical harmonics Tapering, and
Spatial Subsampling, all up to a spherical harmonics order of 7.
Although dry environments violate some of the assumptions underlying some of the approaches, we can confirm the results of
our previous study: most approaches achieve an improvement, and the magnitude of the improvement is comparable across
approaches and acoustic environments.
Interaural Cues Cartography: Localization Cues Repartition for Three Spatialization Methods

The Synthetic Transaural Audio Rendering (STAR) method, first
introduced at DAFx-06 and then enhanced at DAFx-19, is a perceptual approach to sound spatialization that aims at reproducing the acoustic cues at the ears of the listener using loudspeakers. To validate the method, several comparisons with state-of-the-art spatialization methods (VBAP and HOA) were conducted. Previously,
quality comparisons with human subjects have been made, providing meaningful subjective results in real conditions. In this
article an objective comparison is proposed, using acoustic cues
error maps. The cartography enables us to study the spatialization
effect in a 2D space, for a listening position within an audience,
and thus not necessarily located at the center. Two approaches
are conducted: the first simulates the binaural signals for a virtual KEMAR manikin, in ideal conditions and with a fine resolution; the second records these binaural signals using a real KEMAR manikin, providing real data with reverberation, though with
a coarser resolution. In both cases the acoustic cues were derived
from the binaural signals (either simulated or measured), and compared to the reference value taken at the center of the octophonic
loudspeaker configuration. The obtained error maps show encouraging results, with our STAR method producing the smallest error for
both simulated and experimental conditions.
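Deriving interaural cues from a pair of binaural signals can be sketched simply: the interaural time difference (ITD) from the lag of the interaural cross-correlation, and the interaural level difference (ILD) from the energy ratio. This broadband, unfiltered version is a simplified stand-in for whatever cue model the study actually uses.

```python
import numpy as np

def interaural_cues(left, right, fs):
    """Broadband ITD (s) and ILD (dB) from a binaural signal pair.

    Positive ITD means the left-ear signal arrives later than the right.
    """
    xcorr = np.correlate(left, right, mode='full')       # interaural cross-correlation
    lag = np.argmax(xcorr) - (len(right) - 1)            # peak lag in samples
    itd = lag / fs
    ild = 10 * np.log10(np.sum(left ** 2) / np.sum(right ** 2))
    return itd, ild
```

Evaluating such cues on a grid of listening positions and comparing them to the values at the central reference position is the kind of computation behind the error maps described above.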
Neural Parametric Equalizer Matching Using Differentiable Biquads

This paper proposes a neural network for carrying out parametric equalizer (EQ) matching. The novelty of this neural network
solution is that it can be optimized directly in the frequency domain by means of differentiable biquads, rather than relying solely
on a loss on parameter values, which does not correlate directly
with the system output. We compare the performance of the proposed neural network approach with that of a baseline algorithm
based on a convex relaxation of the problem. It is observed that the
neural network can provide better matching than the baseline approach because it directly attempts to solve the non-convex problem. Moreover, we show that the same network trained with only
a parameter loss is insufficient for the task, despite the fact that it
matches underlying EQ parameters better than one trained with a
combination of spectral and parameter losses.
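The idea behind differentiable biquads can be sketched concretely: peak-EQ parameters map to biquad coefficients in closed form (here the well-known RBJ Audio EQ Cookbook formulas), so a spectral loss on the resulting magnitude response is differentiable with respect to the parameters when implemented in an autodiff framework. NumPy stands in for such a framework in this illustration; the paper's actual network and loss may differ.

```python
import numpy as np

def peak_eq_coeffs(fc, gain_db, Q, fs):
    """Peak-EQ biquad coefficients (RBJ cookbook), normalized so a[0] = 1."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * fc / fs
    alpha = np.sin(w0) / (2 * Q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

def magnitude_db(b, a, w):
    """Biquad magnitude response in dB at angular frequencies w (rad/sample)."""
    z = np.exp(-1j * w)                                # z^-1 on the unit circle
    H = (b[0] + b[1] * z + b[2] * z ** 2) / (a[0] + a[1] * z + a[2] * z ** 2)
    return 20 * np.log10(np.abs(H))
```

A frequency-domain matching loss would then be, e.g., the mean squared difference between `magnitude_db` of the predicted and target EQs over a frequency grid, every step of which is a smooth function of (fc, gain_db, Q).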
Relative Music Loudness Estimation Using Temporal Convolutional Networks and a CNN Feature Extraction Front-End

Relative music loudness estimation is a MIR task that consists of dividing audio into segments of three classes: Foreground Music,
Background Music, and No Music. Given the temporal correlation
of music, in this work we approach the task using a type of network
with the ability to model temporal context: the Temporal Convolutional Network (TCN). We propose two architectures: a TCN,
and a novel architecture resulting from the combination of a TCN
with a Convolutional Neural Network (CNN) front-end. We name
this new architecture CNN-TCN. We expect the CNN front-end to
work as a feature extraction strategy to achieve a more efficient usage of the network’s parameters. We use the OpenBMAT dataset
to train and test 40 TCN and 80 CNN-TCN models with two grid
searches over a set of hyper-parameters. We compare our models with the two best algorithms submitted to the tasks of music
detection and relative music loudness estimation in MIREX 2019.
All our models outperform the MIREX algorithms even when using fewer parameters. The CNN-TCN emerges as the
best architecture as all its models outperform all TCN models. We
show that adding a CNN front-end to a TCN can actually reduce
the number of parameters of the network while improving performance. The CNN front-end effectively works as a feature extractor, producing consistent patterns that identify different combinations of music and non-music sounds, and also helps produce
a smoother output in comparison to the TCN models.
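The temporal context a TCN can model is bounded by its receptive field, which grows with depth through the dilation of its causal convolutions. A quick sketch of that bookkeeping (the kernel size and dilation schedule below are the common powers-of-two pattern, not necessarily this paper's hyper-parameters):

```python
def tcn_receptive_field(kernel_size, dilations):
    """Receptive field (in frames) of a stack of dilated causal conv layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

For example, kernel size 3 with dilations 1, 2, 4, ..., 64 already covers 255 frames, which is why a TCN can model long-range structure with few layers.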
Neural Modelling of Time-Varying Effects

This paper proposes a grey-box neural network based approach to modelling LFO-modulated time-varying effects. The neural
network model receives both the unprocessed audio, as well as
the LFO signal, as input. This allows complete control over the
model’s LFO frequency and shape. The neural networks are trained
using guitar audio, which has to be processed by the target effect
and also annotated with the predicted LFO signal before training.
A measurement signal based on regularly spaced chirps was used
to accurately predict the LFO signal. The model architecture has
been previously shown to be capable of running in real-time on a
modern desktop computer, whilst using relatively little processing
power. We validate our approach by creating models of both a phaser and a flanger effects pedal; in principle, it can be applied to any LFO-modulated time-varying effect. In the best case, an error-to-signal ratio of 1.3% is achieved when modelling a flanger pedal,
and previous work has shown that this corresponds to the model
being nearly indistinguishable from the target device.
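To make the target effect class concrete, a flanger is essentially a short delay line whose delay time is swept by an LFO and mixed back with the dry signal. The sketch below is a generic digital flanger with linear interpolation, not the paper's neural model or any particular pedal; all parameter values are illustrative.

```python
import numpy as np

def flanger(x, fs, rate_hz=0.5, depth_ms=2.0, base_ms=1.0, mix=0.7):
    """LFO-modulated delay (flanger) with linear-interpolated fractional delay."""
    n = np.arange(len(x))
    lfo = 0.5 * (1 + np.sin(2 * np.pi * rate_hz * n / fs))   # LFO in [0, 1]
    delay = (base_ms + depth_ms * lfo) * fs / 1000.0         # delay in samples
    idx = n - delay                                          # fractional read index
    i0 = np.floor(idx).astype(int)
    frac = idx - i0
    i0c = np.clip(i0, 0, len(x) - 1)
    i1c = np.clip(i0 + 1, 0, len(x) - 1)
    delayed = np.where(idx >= 0,
                       (1 - frac) * x[i0c] + frac * x[i1c],  # linear interpolation
                       0.0)
    return x + mix * delayed                                 # dry + wet mix
```

The LFO signal here is exactly the kind of conditioning input the grey-box model receives alongside the unprocessed audio, which is what gives the trained model control over modulation rate and shape.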