Analysing auditory representations for sound classification with self-organising neural networks
Three different auditory representations (Lyon’s cochlear model, Patterson’s gammatone filterbank combined with Meddis’ inner hair cell model, and mel-frequency cepstral coefficients) are analyzed in connection with self-organizing maps to evaluate their suitability for a perceptually justified classification of sounds. The self-organizing maps are trained with a uniform set of test sounds preprocessed by the auditory representations. The structure of the resulting feature maps and the trajectories of the individual sounds are visualized and compared to one another. While MFCC proved to be a very efficient representation, the gammatone model produced the most convincing results.
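To make the idea of a sound's "trajectory" on a trained map concrete, here is a minimal sketch of a one-dimensional self-organizing map in plain Python. It is not the authors' setup: the map size, learning-rate schedule, neighbourhood function, and the two-dimensional toy frames standing in for preprocessed auditory features are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def train_som(frames, n_units=6, epochs=40, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map on a list of feature vectors."""
    rng = random.Random(seed)
    dim = len(frames[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    total = epochs * len(frames)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(frames, len(frames)):  # shuffled pass over the data
            frac = step / total
            lr = lr0 * (1.0 - frac)                          # decaying learning rate
            radius = max(0.5, (n_units / 2) * (1.0 - frac))  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood around the winning unit
                h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * h * (x[d] - w[d])
            step += 1
    return weights


def trajectory(weights, frames):
    """Sequence of winning units visited by the frames of one sound."""
    return [best_matching_unit(weights, x) for x in frames]


# Hypothetical feature frames for two toy "sounds" (assumed data).
sound_a = [[0.10, 0.10], [0.15, 0.10], [0.10, 0.20]]
sound_b = [[0.90, 0.85], [0.85, 0.90], [0.90, 0.95]]

som = train_som(sound_a + sound_b)
path_a = trajectory(som, sound_a)
path_b = trajectory(som, sound_b)
print(path_a, path_b)
```

In a comparison like the one described above, each auditory representation would supply its own frames, and the resulting maps and trajectories would be inspected side by side.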
Visualization and calculation of the roughness of acoustical musical signals using the Synchronization Index Model (SIM)
The synchronization index model of sensory dissonance and roughness accounts for the degree of phase-locking to a particular frequency that is present in the neural patterns. Sensory dissonance (roughness) is defined as the energy of the relevant beating frequencies in the auditory channels with respect to the total energy. The model takes rate-code patterns at the level of the auditory nerve as input and outputs a sensory dissonance (roughness) value. The synchronization index model entails a straightforward visualization of the principles underlying sensory dissonance and roughness, in particular in terms of (i) roughness contributions with respect to cochlear mechanical filtering (on a Critical Band scale), and (ii) roughness contributions with respect to phase-locking synchrony (i.e., the synchronization index for the relevant beating frequencies on a frequency scale). This paper presents the concept and implementation of the synchronization index model and its application to musical scales.
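The definition above (energy at the beating frequencies relative to total energy) can be illustrated with a rough single-channel sketch. This is not the synchronization index model itself: half-wave rectification stands in for the auditory-nerve rate code, there is no cochlear filterbank, and the 20 to 150 Hz "beating band" is a hypothetical choice made only for this example.

```python
import cmath
import math


def half_wave_rectify(signal):
    """Crude stand-in for a neural rate-code pattern (assumption)."""
    return [max(x, 0.0) for x in signal]


def band_energy_ratio(signal, sample_rate, band_hz):
    """Energy of DFT bins inside band_hz divided by total energy (DC excluded)."""
    n = len(signal)
    lo, hi = band_hz
    # Precomputed twiddle factors for a direct (slow but dependency-free) DFT.
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    in_band = 0.0
    total = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * twiddle[(k * t) % n] for t, x in enumerate(signal))
        energy = abs(coeff) ** 2
        total += energy
        if lo <= k * sample_rate / n <= hi:
            in_band += energy
    return in_band / total if total > 0.0 else 0.0


SAMPLE_RATE = 8000             # Hz (assumed)
N = 800                        # 0.1 s of signal, 10 Hz bin spacing
BEAT_BAND_HZ = (20.0, 150.0)   # hypothetical range of "relevant" beating frequencies


def sine(freq_hz, n):
    return math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE)


# A single tone has no beats; two tones 70 Hz apart beat at 70 Hz.
steady = [sine(1000, n) for n in range(N)]
beating = [sine(1000, n) + sine(1070, n) for n in range(N)]

steady_ratio = band_energy_ratio(half_wave_rectify(steady), SAMPLE_RATE, BEAT_BAND_HZ)
beating_ratio = band_energy_ratio(half_wave_rectify(beating), SAMPLE_RATE, BEAT_BAND_HZ)
print(steady_ratio, beating_ratio)
```

The beating pair yields a clearly larger ratio than the single tone, which is the qualitative behaviour the energy-ratio definition is meant to capture.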
The best of two worlds: retrieving and browsing
This paper describes the combination of two software systems for work with music corpora in electronic formats. A set of algorithms has been developed in CPN View (a class library for representing music scores) that deals with music score processing. These facilitate access to the ever-increasing collections of music corpora [1]. The Sonic Browser (a browser that uses sonic spatialization for navigating music or sound databases) has been developed to the proof-of-concept and prototype implementation stage. In previous work it has been demonstrated that with the Sonic Browser it is up to 28% faster for users to find a particular melody in a set of melodies, compared to visual browsing [2].
Blackboard system and top-down processing for the transcription of simple polyphonic music
A system is proposed to perform the automatic music transcription of simple polyphonic tracks using top-down processing. It is composed of a blackboard system of three hierarchical levels, receiving its input from a segmentation routine in the form of an averaged STFT matrix. The blackboard contains a hypotheses database, a scheduler and knowledge sources, one of which is a neural network chord recogniser with the ability to reconfigure the operation of the system, allowing it to output more than one note hypothesis at a time. The basic implementation is explained, and some examples are provided to illustrate the performance of the system. The weaknesses of the current implementation are shown and next steps for further development of the system are defined.
Robust multipitch estimation for the analysis and manipulation of polyphonic musical signals
A method for the estimation of the multiple pitches of concurrent musical sounds is described. Experimental data comprised sung vowels and the whole pitch range of 26 musical instruments. Multipitch estimation was performed at the level of a single time frame for random pitch and sound source combinations. Note error rates for mixtures ranging from one to six simultaneous sounds were 2.1 %, 2.4 %, 3.8 %, 8.1 %, 12 %, and 18 %, respectively. In musical interval and chord identification tasks, the algorithm outperformed the average of ten trained musicians. Particular emphasis was laid on robustness in the presence of other sounds and noise. The algorithm is based on an iterative estimation and separation procedure and is able to resolve at least a couple of the most prominent pitches even in ten-sound polyphonies. Sounds that exhibit inharmonicities can be handled without problems, and the inharmonicity factor and spectral envelope of each sound are estimated along with the pitch. Examples are given of musical signal manipulations that become possible with the proposed method.
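The phrase "iterative estimation and separation" can be illustrated with a generic estimate-and-cancel loop. The sketch below is not the paper's algorithm: it assumes an idealized line spectrum on an integer-Hz grid, uses a simple 1/h-weighted harmonic sum as the salience measure, and "separates" a detected sound by deleting its partials outright, which also discards any partials it shares with other sounds. All function names and parameters are hypothetical.

```python
def harmonic_salience(spectrum, f0, n_harmonics):
    """Weighted sum of spectral magnitudes at the harmonics of f0."""
    return sum(spectrum.get(h * f0, 0.0) / h for h in range(1, n_harmonics + 1))


def iterative_multipitch(spectrum, n_pitches, candidates, n_harmonics=8):
    """Repeatedly pick the most salient pitch, then cancel its partials."""
    residual = dict(spectrum)
    pitches = []
    for _ in range(n_pitches):
        best = max(
            candidates,
            key=lambda f0: harmonic_salience(residual, f0, n_harmonics),
        )
        if harmonic_salience(residual, best, n_harmonics) <= 0.0:
            break  # nothing left to explain
        pitches.append(best)
        # Crude separation step: remove the detected sound's partials.
        for h in range(1, n_harmonics + 1):
            residual.pop(h * best, None)
    return pitches


def harmonic_tone(f0, n_harmonics=8):
    """Idealized tone: partial h has magnitude 1/h (assumed spectrum)."""
    return {h * f0: 1.0 / h for h in range(1, n_harmonics + 1)}


# Mixture of two tones a fifth apart; they share partials at 660 and 1320 Hz.
mixture = {}
for f0 in (220, 330):
    for freq, mag in harmonic_tone(f0).items():
        mixture[freq] = mixture.get(freq, 0.0) + mag

estimated = iterative_multipitch(mixture, n_pitches=2, candidates=range(100, 501))
print(sorted(estimated))  # [220, 330]
```

The 1/h weighting keeps subharmonic candidates such as 110 Hz from outscoring the true pitches in this toy case; real recordings need considerably more care, in particular for partials shared between sounds.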
On the use of zero-crossing rate for an application of classification of percussive sounds
We address the issue of automatically extracting rhythm descriptors from audio signals, to be eventually used in content-based musical applications such as in the context of MPEG7. Our aim is to approach the comprehension of auditory scenes in raw polyphonic audio signals without preliminary source separation. As a first step towards the automatic extraction of rhythmic structures out of signals taken from the popular music repertoire, we propose an approach for automatically extracting time indexes of occurrences of different percussive timbres in an audio signal. Within this framework, we found that a particular issue lies in the classification of percussive sounds. In this paper, we report on the method currently used to deal with this problem.
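The title names the zero-crossing rate as the feature of interest, although the abstract does not say how it is used for classification. As a minimal sketch of the feature alone (not the authors' classifier), the function below counts sign changes between adjacent samples. The sample rate and the two test tones are assumptions for illustration.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0.0) != (b >= 0.0)
    )
    return crossings / (len(frame) - 1)


SAMPLE_RATE = 8000  # Hz (assumed)


def sine_frame(freq_hz, n_samples=800):
    return [math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(n_samples)]


# A low tone changes sign rarely; a high tone changes sign often.
zcr_low = zero_crossing_rate(sine_frame(100))
zcr_high = zero_crossing_rate(sine_frame(3000))
print(zcr_low, zcr_high)
```

For a pure tone the rate is roughly twice the frequency divided by the sample rate, so sounds dominated by high-frequency or noisy content tend to score higher than sounds dominated by low-frequency content.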
Estimating the plucking point on a guitar string
This paper presents a frequency-domain technique for estimating the plucking point on a guitar string from an acoustically recorded signal. It also includes an original method for detecting the fingering point, based on the plucking point information.
A simplified approach to high quality music and sound over IP
Present systems for streaming digital audio between devices connected by the internet have been limited by a number of compromises. Because of restricted bandwidth and “best effort” delivery, signal compression of one form or another is typical. Buffering of audio data, which is needed to safeguard against delivery uncertainties, can cause signal delays of seconds. Audio is in general an unforgiving test of networking: if one data packet arrives too late, we hear it. Trade-offs of signal quality have been necessary to work around this basic fact and, until now, have worked against serious musical uses. Beginning in late 1998, audio applications specifically designed for next-generation networks were initiated that could meet the stringent requirements of professional-quality music streaming. A related experiment was begun to explore the use of audio as a network measurement tool. SoundWIRE (sound waves over the internet from real-time echoes) creates a sonar-like ping to display to the ear qualities of bidirectional connections. Recent experiments have achieved coast-to-coast sustained audio connections whose round trip times are within a factor of 2 of the speed of light. Full-duplex speech over these connections feels comfortable, and in an IIR recirculating form that creates echoes like SoundWIRE, users can experience singing into a transcontinental echo chamber. Three simplifications to audio streaming are suggested in this paper: (1) compression has been eliminated to reduce delay and enhance signal quality; (2) TCP/IP is used in unidirectional flows for its delivery guarantees, thereby eliminating the need for application software to correct transmission errors; (3) QoS puts bounds on latency and jitter affecting long-haul bidirectional flows.
Modelling digital musical effects for signal processors, based on real effect manifestation analysis
For quite some time, in the commercial use of digital audio effects, there have been efforts to simulate analog effects and effect processors in software. This paper deals with the analysis of musical effects, the design of algorithms for simulating these effects, and their realization on both digital signal processors and the PC platform in the form of plug-in modules for the DirectX environment. It also deals with the problem of controlling the effect parameters and with subjective testing of algorithms, and it examines the fidelity of simulated effects as compared with the original.