UNSUPERVISED VALIDITY MEASURES FOR VOCALIZATION CLUSTERING

Bibliographic Details
Main Authors: Kuntoro Adi, Kristine E. Sonstrom, Peter M. Scheifele, Michael T. Johnson
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Subjects:
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.329.3268
http://speechlab.eece.mu.edu/johnson/papers/adi_icassp08.pdf
Description
Summary: This paper describes unsupervised speech/speaker cluster validity measures based on a dissimilarity metric, for the purpose of estimating the number of clusters in a speech data set as well as assessing the consistency of the clustering procedure. The number of clusters is estimated by minimizing the cross-data dissimilarity values, while algorithm consistency is evaluated by calculating the dissimilarity values across multiple experimental runs. The method is demonstrated on the task of beluga whale vocalization clustering. Index Terms: speech/speaker clustering, unsupervised validity, dissimilarity value, validation of classifiers.
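
The two uses of a dissimilarity value named in the summary (choosing the number of clusters by minimization, and checking consistency across runs) can be illustrated with a minimal sketch. This record does not give the paper's actual dissimilarity metric, so the sketch assumes a stand-in: the fraction of item pairs that two labelings group differently. The function names `partition_dissimilarity`, `run_consistency`, and `pick_num_clusters` are hypothetical, and the label lists are toy data, not results from the paper.

```python
from itertools import combinations

import numpy as np


def partition_dissimilarity(labels_a, labels_b):
    """Fraction of item pairs grouped together in one labeling but not the other.

    Assumed stand-in metric, not the paper's definition. It ignores the
    label names themselves, so relabeled copies of one partition score 0.0.
    """
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("labelings must be 1-D and of equal length")
    n = a.size
    if n < 2:
        return 0.0
    same_a = a[:, None] == a[None, :]   # co-membership matrix for labeling a
    same_b = b[:, None] == b[None, :]   # co-membership matrix for labeling b
    upper = np.triu_indices(n, k=1)     # each unordered pair counted once
    return float(np.mean(same_a[upper] != same_b[upper]))


def run_consistency(labelings):
    """Mean dissimilarity over all pairs of runs (0.0 means every run agrees)."""
    pairs = list(combinations(labelings, 2))
    if not pairs:
        raise ValueError("need at least two runs")
    return float(np.mean([partition_dissimilarity(x, y) for x, y in pairs]))


def pick_num_clusters(runs_by_k):
    """Return the candidate k with the lowest mean dissimilarity, plus all scores."""
    scores = {k: run_consistency(runs) for k, runs in runs_by_k.items()}
    return min(scores, key=scores.get), scores


# Toy labelings of six items from three hypothetical runs per candidate k.
runs_by_k = {
    2: [[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1], [0, 0, 0, 0, 1, 1]],
    3: [[0, 0, 1, 1, 2, 2], [2, 2, 0, 0, 1, 1], [1, 1, 2, 2, 0, 0]],
}
best_k, scores = pick_num_clusters(runs_by_k)
```

With this toy data the three runs for k = 3 are the same partition under different label names, so their score is 0.0 and `best_k` is 3. A pair-counting measure is used here because it needs no matching of cluster labels between runs; a real study would substitute the metric defined in the paper.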