On the Dynamics of the Harpsichord and its Synthesis
It is common knowledge that the piano was developed to produce a keyboard instrument with a larger dynamic range and higher sound radiation level than the harpsichord possesses. Also, the harpsichord is a plucked string instrument with a very controlled mechanism to excite the string. For these reasons it is often falsely understood that the harpsichord does not exhibit any dynamic variation. On the contrary, the signal analysis and the listening test made in this study show that minor but audible differences in the dynamic levels exist. The signal analysis shows that stronger playing forces produce higher levels in the harmonics. The energy given by the player is not only delivered to the plucking mechanism but also carried on from the key to the body. This is evident from the increased level of body mode radiation. A synthesis model for approximating the dynamic behavior of the harpsichord is also proposed. It contains gain and timbre control, and a parallel filter structure to simulate the soundboard knock characteristic of high key velocity tones.
Musical Sound Timbre: Verbal Description and Dimensions
Two approaches to the study of musical sound timbre are described and documented with examples of psychoacoustic experiments. The classical bottom-up approach is demonstrated on the study of contexts of violin sounds and pipe organ sounds. Verbal attributes collected during listening tests were used for the interpretation and comparison of the resulting perceptual spaces of sounds. The proposed top-down approach is based on collecting musical experts' experiences and opinions, going from very common ones to more specific ones. Here the common perceptual space (the perceptual space of verbal attributes) was constructed from a non-listening test of the dissimilarity of verbal attributes describing timbre (a verbal or sound-free context of stimuli). The verbal interpretations of the perceptual spaces of sound contexts and of the perceptual space of verbal attributes are compared, and the hypothesis of four basic dimensions of timbre is formulated: 1. gloomy – clear, 2. harsh – delicate, 3. full – narrow, 4. noisy – ?.
Musical Key Estimation of Audio Signal Based on Hidden Markov Modeling of Chroma Vectors
In this paper, we propose a system for the automatic estimation of the key of a music track using hidden Markov models. The front-end of the system performs transient/noise reduction and estimation of the tuning, and then represents the track as a succession of chroma vectors over time. The characteristics of the major and minor modes are learned by training two hidden Markov models on a labeled database. The 24 hidden Markov models corresponding to the various keys are then derived from the two trained models. The key of a music track is then estimated by computing the likelihood of its chroma sequence given each HMM. The system is evaluated positively using a database of European baroque, classical and romantic music. We compare the results with those obtained using a cognitive-based approach. We also compare the chroma-key profiles learned from the database to the cognitive-based ones.
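The step from two trained models to 24 key models can be illustrated on chroma profiles alone. The sketch below assumes that the 24 keys are obtained by circularly rotating the 12 chroma bins of a C major and a C minor profile; the abstract does not state the exact derivation, and the binary scale templates `MAJOR_C` and `MINOR_C` are hypothetical stand-ins for trained model parameters.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Hypothetical templates (not trained values): 1 marks a scale degree.
MAJOR_C = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1]  # C major scale
MINOR_C = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0]  # C natural minor scale


def rotate(profile, semitones):
    """Circularly shift a 12-bin chroma profile upward by `semitones`."""
    profile = list(profile)
    k = semitones % 12
    return profile[-k:] + profile[:-k] if k else profile


def derive_key_profiles(major_c, minor_c):
    """Build 24 key profiles from one major and one minor profile rooted on C."""
    keys = {}
    for shift, name in enumerate(PITCH_CLASSES):
        keys[name + " major"] = rotate(major_c, shift)
        keys[name + " minor"] = rotate(minor_c, shift)
    return keys


if __name__ == "__main__":
    for key, profile in derive_key_profiles(MAJOR_C, MINOR_C).items():
        print(f"{key:9s} {profile}")
```

With binary scale templates, relative keys such as C major and A minor receive identical profiles, which is one reason trained, weighted models are needed to tell them apart.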
Onset Detection Revisited
Various methods have been proposed for detecting the onset times of musical notes in audio signals. We examine recent work on onset detection using spectral features such as the magnitude, phase and complex domain representations, and propose improvements to these methods: a weighted phase deviation function and a half-wave rectified complex difference. These new algorithms are compared with several state-of-the-art algorithms from the literature, and all are tested using a standard data set of short excerpts from a range of instruments (1060 onsets), plus a much larger data set of piano music (106054 onsets). Some of the results contradict previously published results and suggest that a similarly high level of performance can be obtained with a magnitude-based (spectral flux), a phase-based (weighted phase deviation) or a complex domain (complex difference) onset detection function.
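As a rough illustration of the magnitude-based detection function named above, the following sketch computes spectral flux as the half-wave rectified frame-to-frame increase in magnitude, summed over frequency bins. It assumes the magnitude spectra are already available as lists of floats; the fixed-threshold peak picker `pick_peaks` is a simplified placeholder and not the peak-picking scheme of the paper.

```python
def spectral_flux(mag_frames):
    """Half-wave rectified magnitude difference between consecutive frames.

    mag_frames: list of magnitude spectra (equal-length lists of floats).
    Returns one value per frame; the first frame has no predecessor, so 0.0.
    """
    flux = [0.0] if mag_frames else []
    for prev, cur in zip(mag_frames, mag_frames[1:]):
        # Only energy increases count; decreases are clipped to zero.
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux


def pick_peaks(odf, threshold):
    """Indices of local maxima of the detection function above a fixed threshold."""
    return [
        n
        for n in range(1, len(odf) - 1)
        if odf[n] > threshold and odf[n] > odf[n - 1] and odf[n] >= odf[n + 1]
    ]


if __name__ == "__main__":
    frames = [[0, 0, 0], [1, 2, 0], [1, 1, 0], [1, 1, 3], [1, 1, 3]]
    odf = spectral_flux(frames)
    print(odf, pick_peaks(odf, threshold=1.0))
```

The rectification is what distinguishes onsets from offsets: a note ending lowers the magnitudes and therefore contributes nothing to the sum.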
A New Analysis Method for Sinusoids+Noise Spectral Models
Existing deterministic+stochastic spectral models assume that sounds have low noise levels. The stochastic part of the sound is generally estimated by subtracting the deterministic part: it is assumed to be the residual. Inevitable errors in the estimation of the parameters of the deterministic part result in errors, often worse, in the estimation of the stochastic part. We propose a new method that avoids these errors. Our method analyzes the stochastic part without any prior knowledge of the deterministic part. It relies on the study of the distribution of the amplitude values in successive short-time spectra. Computing the statistical moments or the maximum likelihood leads to an estimate of the noise power density. Experiments on synthetic and natural sounds show that this method is promising.
Adaptive Noise Level Estimation
We describe a novel algorithm for the estimation of the colored noise level in audio signals with mixed noise and sinusoidal components. The noise envelope model is based on the assumptions that the envelope varies slowly with frequency and that the magnitudes of the noise peaks obey a Rayleigh distribution. Our method is an extension of a recently proposed approach to classifying spectral peaks as sinusoids or noise, and it takes a noise envelope model into account to improve the detection of sinusoidal peaks. By iteratively evaluating and adapting the noise envelope model, the classification of noise and sinusoidal peaks is refined until the detected noise peaks are coherently explained by the noise envelope model. Test examples estimating white noise and colored noise are presented.
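The iterative refinement described above can be sketched in a much reduced form. The example below assumes a single flat noise level (one Rayleigh scale parameter for all peaks) instead of a frequency-dependent envelope, uses the standard maximum-likelihood estimate of the Rayleigh scale, and treats any peak above a chosen Rayleigh quantile as sinusoidal. The function names and the quantile `p` are illustrative assumptions, not taken from the paper.

```python
import math


def rayleigh_scale(mags):
    """Maximum-likelihood estimate of the Rayleigh scale parameter sigma."""
    return math.sqrt(sum(m * m for m in mags) / (2 * len(mags)))


def rayleigh_quantile(sigma, p):
    """Magnitude below which a Rayleigh variable falls with probability p."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - p))


def classify_peaks(mags, p=0.99, max_iter=20):
    """Alternate between fitting the noise level and rejecting outlier peaks.

    Returns (sigma, noise_indices, sinusoid_indices).
    """
    noise = list(range(len(mags)))
    sigma = rayleigh_scale(mags)
    for _ in range(max_iter):
        sigma = rayleigh_scale([mags[i] for i in noise])
        limit = rayleigh_quantile(sigma, p)
        kept = [i for i in noise if mags[i] <= limit]
        if not kept or kept == noise:
            break  # nothing left to fit, or the noise set is self-consistent
        noise = kept
    noise_set = set(noise)
    sinusoids = [i for i in range(len(mags)) if i not in noise_set]
    return sigma, noise, sinusoids


if __name__ == "__main__":
    peaks = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 50.0, 0.95, 1.05, 1.0]
    print(classify_peaks(peaks))
```

A strong sinusoidal peak inflates the first scale estimate; once it is rejected, the second fit drops to the level of the remaining peaks and the loop stops because no further peak exceeds the limit.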
Categories of Perception for Vibrato, Flange, and Stereo Chorus: Mapping Out the Musically Useful Ranges of Modulation Rate and Depth for Delay-Based Effects
Vibrato, flange, and stereo chorus are perhaps the three most often used digital audio effects that are created by smoothly modulating the length of a delay line, typically at sub-audio rates. Common practice is to use a periodic or quasi-periodic modulation control signal with a frequency roughly between 2 and 9 Hz, and both the rate and depth of delay modulation are typically adjusted according to the aesthetic criteria of a performer or an audio production engineer. In order to establish norms for the musically useful range of modulation rate and depth for such delay-based effects, 25 listeners were asked to make categorical judgments regarding their perception of vibrato, flange, and stereo chorus effects. For these two modulation parameters, the results map out three perceptual regions for the three related effects: the region in which modulation is too subtle for effective use, the parameter ranges that seem most musically useful, and the region in which it is too extreme for most musical applications. Of particular interest is the observed commonality between these perceptual regions for vibrato, flange, and stereo chorus effects.
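The two parameters studied, modulation rate and depth, can be made concrete with a minimal modulated delay line. The sketch below assumes a sinusoidal control signal and linear interpolation between samples; the function name, the parameter names, and the choice to express delay in samples are illustrative assumptions. With `mix=1.0` only the delayed signal is heard (vibrato), while mixing in the dry signal gives a flange-like or mono chorus-like result; a stereo chorus would need a second, differently modulated channel.

```python
import math


def modulated_delay(x, sample_rate, rate_hz, base_delay, depth, mix=1.0):
    """Read `x` through a delay line whose length is modulated by a sine wave.

    base_delay and depth are in samples; depth must not exceed base_delay
    so that the delay never becomes negative (non-causal).
    """
    if depth > base_delay:
        raise ValueError("depth must not exceed base_delay")
    out = []
    for n in range(len(x)):
        lfo = math.sin(2.0 * math.pi * rate_hz * n / sample_rate)
        pos = n - (base_delay + depth * lfo)  # fractional read position
        i = math.floor(pos)
        frac = pos - i
        # Samples outside the signal are treated as silence.
        a = x[i] if 0 <= i < len(x) else 0.0
        b = x[i + 1] if 0 <= i + 1 < len(x) else 0.0
        wet = (1.0 - frac) * a + frac * b  # linear interpolation
        out.append((1.0 - mix) * x[n] + mix * wet)
    return out


if __name__ == "__main__":
    sr = 8000
    tone = [math.sin(2.0 * math.pi * 440.0 * n / sr) for n in range(sr)]
    vibrato = modulated_delay(tone, sr, rate_hz=5.0, base_delay=40.0, depth=20.0)
    print(len(vibrato))
```

Setting `depth` to zero reduces the effect to a fixed delay, which is a convenient sanity check before exploring the rate and depth ranges.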
Inter Genre Similarity Modeling for Automatic Music Genre Classification
Music genre classification is an essential tool for music information retrieval systems, and it has been finding critical applications in various media platforms. Two important problems in automatic music genre classification are feature extraction and classifier design. This paper investigates inter-genre similarity (IGS) modelling to improve the performance of automatic music genre classification. Inter-genre similarity information is extracted over the misclassified feature population. Once the inter-genre similarity is modelled, eliminating it reduces the inter-genre confusion and improves the identification rates. Inter-genre similarity modelling is further improved with iterative IGS modelling (IIGS) and score modelling for IGS elimination (SMIGS). Experimental results with promising classification improvements are provided.
Variable Pre-Emphasis LPC for Modeling Vocal Effort in the Singing Voice
In speech and singing, the spectral envelope of the glottal source varies according to different voice qualities such as vocal effort, lax voice, and breathy voice. In contrast, linear prediction coding (LPC) models the glottal source in a way that is not flexible. The spectral envelope of the source estimated by LPC is fixed and determined by the pre-emphasis filter. In standard LPC, the formant filter therefore captures variation in the spectral envelope that should be associated with the source. This paper presents variable pre-emphasis LPC (VPLPC) as a technique that allows the estimated source to vary. This results in formant filters that remain more consistent across variations in vocal effort and breathiness. VPLPC also provides a way to change the envelope of the estimated source, thereby changing the perception of vocal effort. The VPLPC algorithm is used to manipulate some voice excerpts with promising but mixed results. Possible improvements are suggested.
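To show what the pre-emphasis stage contributes, the sketch below implements the common first-order pre-emphasis filter y[n] = x[n] - alpha * x[n-1] and evaluates its gain at a given frequency. Making `alpha` adjustable changes the spectral tilt applied before LPC analysis. This is only an illustration of that one stage under the assumption of a first-order filter; the abstract does not specify the filter form or how VPLPC selects its parameters.

```python
import math


def pre_emphasis(x, alpha):
    """First-order pre-emphasis: y[n] = x[n] - alpha * x[n-1], with x[-1] = 0."""
    y, prev = [], 0.0
    for sample in x:
        y.append(sample - alpha * prev)
        prev = sample
    return y


def gain_db(alpha, freq_hz, sample_rate):
    """Magnitude response of 1 - alpha * z^-1 in dB (requires a nonzero gain)."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    mag = math.sqrt(1.0 - 2.0 * alpha * math.cos(w) + alpha * alpha)
    return 20.0 * math.log10(mag)


if __name__ == "__main__":
    sr = 16000
    for alpha in (0.5, 0.9, 0.97):
        low = gain_db(alpha, 100.0, sr)
        high = gain_db(alpha, sr / 2, sr)
        print(f"alpha={alpha}: {low:.1f} dB at 100 Hz, {high:.1f} dB at Nyquist")
```

Larger values of `alpha` attenuate low frequencies more strongly relative to high ones, so varying it is one simple way to move spectral tilt between the source estimate and the formant filter.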
Frequency-Dependent Boundary Condition for the 3-D Digital Waveguide Mesh
The three-dimensional digital waveguide mesh is a method for modeling the propagation of sound waves in space. It provides a simulation of the state of the whole sound field at discrete time steps. The updating functions of the mesh can be formulated either using physical values of sound pressure or particle velocity, also called the Kirchhoff values, or using a wave decomposition of these instead. Computation in homogeneous media is significantly lighter using Kirchhoff variables, but frequency-dependent boundary conditions are more easily defined with wave variables. In this paper, a conversion method between these two variable types is further simplified. Using the resulting structure, a novel method for defining the mesh boundaries with digital filters is introduced. With this new method, the reflection coefficients can be defined in a frequency-dependent manner at the boundaries of a Kirchhoff variable mesh. This leads to computationally lighter and more realistic simulations than previous solutions.
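For orientation, the Kirchhoff-variable update mentioned above can be sketched for the interior of a rectilinear mesh, where each node has six neighbors and the next pressure is one third of the sum of the neighboring pressures minus the node's own previous pressure. The example is an assumption-laden sketch: it uses nested Python lists, holds all boundary nodes at zero pressure (a fixed, frequency-independent boundary), and does not implement the filter-based boundary method proposed in the paper.

```python
def make_grid(nx, ny, nz):
    """Allocate an nx-by-ny-by-nz grid of zero pressures."""
    return [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]


def step(p_now, p_prev):
    """One time step of a rectilinear Kirchhoff-variable mesh (interior nodes).

    p_next = (sum of the six neighbors at time n) / 3 - p at time n-1.
    Boundary nodes are left at zero pressure.
    """
    nx, ny, nz = len(p_now), len(p_now[0]), len(p_now[0][0])
    p_next = make_grid(nx, ny, nz)
    for x in range(1, nx - 1):
        for y in range(1, ny - 1):
            for z in range(1, nz - 1):
                neighbors = (
                    p_now[x - 1][y][z] + p_now[x + 1][y][z]
                    + p_now[x][y - 1][z] + p_now[x][y + 1][z]
                    + p_now[x][y][z - 1] + p_now[x][y][z + 1]
                )
                p_next[x][y][z] = neighbors / 3.0 - p_prev[x][y][z]
    return p_next


if __name__ == "__main__":
    prev, now = make_grid(7, 7, 7), make_grid(7, 7, 7)
    now[3][3][3] = 1.0  # pressure impulse at the center
    for _ in range(3):
        prev, now = now, step(now, prev)
    print(now[3][3][3])
```

Only two time levels of a single pressure value per node are stored, which reflects why the abstract describes Kirchhoff-variable computation as lighter than the wave-variable form.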