On the evaluation of perceptual similarity measures for music

Elias Pampalk; Simon Dixon; Gerhard Widmer
DAFx-2003 - London
Several applications in the field of content-based interaction with music repositories rely on measures that estimate the perceived similarity of music. These applications include automatic genre recognition, playlist generation, and recommender systems. In this paper we study methods to evaluate the performance of such measures. We compare five measures that use only information extracted from the audio signal and discuss how these measures can be evaluated qualitatively and quantitatively without resorting to large-scale listening tests.