Unsupervised Audio Key and Chord Recognition
This paper presents a new methodology for determining chords of a music piece without using training data. Specifically, we introduce: 1) a wavelet-based audio denoising component to enhance a chroma-based feature extraction framework, 2) an unsupervised key recognition component to extract a bag of local keys, 3) a chord recognizer using estimated local keys to adjust the chromagram based on a set of well-known tonal profiles to recognize chords on a frame-by-frame basis. We aim to recognize 5 classes of chords (major, minor, diminished, augmented, suspended) and 1 N (no chord or silence). We demonstrate the performance of the proposed approach using 175 Beatles’ songs which we achieved 75% in F-measure for estimating a bag of local keys and at least 68.2% accuracy on chords without discarding any audio segments or the use of other musical elements. The experimental results also show that the wavelet-based denoiser improves the chord recognition rate by approximately 4% over that of other chroma features.