Bayesian distance metric learning on i-vector for speaker verification

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 63-66). This thesis explores the use of Bayesian distance metric learning (Bayes_dml) for the task of spe...

Bibliographic Details
Main Author: Fang, Xiao, Ph. D. Massachusetts Institute of Technology
Other Authors: James R. Glass and Najim Dehak, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Format: Thesis
Language: English
Published: Massachusetts Institute of Technology 2013
Subjects:
DML
Online Access: http://hdl.handle.net/1721.1/84870
id ftmit:oai:dspace.mit.edu:1721.1/84870
record_format openpolar
spelling ftmit:oai:dspace.mit.edu:1721.1/84870 2023-06-11T04:11:18+02:00 Bayesian distance metric learning on i-vector for speaker verification Fang, Xiao, Ph. D. Massachusetts Institute of Technology James R. Glass and Najim Dehak. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. 2013 66 pages application/pdf http://hdl.handle.net/1721.1/84870 eng Massachusetts Institute of Technology 868330663 http://dspace.mit.edu/handle/1721.1/7582 Electrical Engineering and Computer Science Thesis 2013 ftmit 2023-05-29T08:37:30Z S.M. Thesis DML DSpace@MIT (Massachusetts Institute of Technology)
institution Open Polar
collection DSpace@MIT (Massachusetts Institute of Technology)
op_collection_id ftmit
language English
topic Electrical Engineering and Computer Science
spellingShingle Electrical Engineering and Computer Science
Fang, Xiao, Ph. D. Massachusetts Institute of Technology
Bayesian distance metric learning on i-vector for speaker verification
topic_facet Electrical Engineering and Computer Science
description Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 63-66). This thesis explores the use of Bayesian distance metric learning (Bayes_dml) for speaker verification using the i-vector feature representation. We propose a framework that explores the distance constraints between i-vector pairs from the same speaker and from different speakers. The distance metric is approximated as a weighted covariance matrix of the top eigenvectors of the data covariance matrix, and variational inference is used to estimate a posterior distribution over the distance metric. Given speaker labels, we select the different-speaker data pairs with the highest cosine scores to form a different-speaker constraint set. This set captures the most discriminative between-speaker variability in the training data. The system is evaluated on the female part of the 2008 NIST SRE dataset and compared with cosine similarity scoring, the state-of-the-art approach. Experimental results show comparable performance between Bayes_dml and cosine similarity scoring. Furthermore, Bayes_dml is insensitive to score normalization compared with cosine similarity scoring. Because it places no requirement on the number of labeled examples, Bayes_dml performs better when training data are limited. by Xiao Fang. S.M.
author2 James R. Glass and Najim Dehak.
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
format Thesis
author Fang, Xiao, Ph. D. Massachusetts Institute of Technology
author_facet Fang, Xiao, Ph. D. Massachusetts Institute of Technology
author_sort Fang, Xiao, Ph. D. Massachusetts Institute of Technology
title Bayesian distance metric learning on i-vector for speaker verification
title_short Bayesian distance metric learning on i-vector for speaker verification
title_full Bayesian distance metric learning on i-vector for speaker verification
title_fullStr Bayesian distance metric learning on i-vector for speaker verification
title_full_unstemmed Bayesian distance metric learning on i-vector for speaker verification
title_sort bayesian distance metric learning on i-vector for speaker verification
publisher Massachusetts Institute of Technology
publishDate 2013
url http://hdl.handle.net/1721.1/84870
genre DML
genre_facet DML
op_relation http://hdl.handle.net/1721.1/84870
868330663
op_rights M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.
http://dspace.mit.edu/handle/1721.1/7582
_version_ 1768386244825317376
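
The constraint-selection step named in the description (choosing the different-speaker pairs with the highest cosine scores) can be illustrated with a minimal sketch. This is not code from the thesis: the function names, the toy two-dimensional "i-vectors", and the value of k are assumptions made for illustration only, and real i-vectors typically have several hundred dimensions.

```python
import numpy as np


def cosine_scores(ivectors):
    """Return the matrix of pairwise cosine similarities between rows."""
    unit = ivectors / np.linalg.norm(ivectors, axis=1, keepdims=True)
    return unit @ unit.T


def hardest_different_speaker_pairs(ivectors, speakers, k):
    """Pick the k different-speaker pairs with the highest cosine scores.

    Hypothetical helper: it mirrors the selection rule stated in the
    abstract, not the thesis's actual implementation.
    """
    scores = cosine_scores(np.asarray(ivectors, dtype=float))
    speakers = np.asarray(speakers)
    # Enumerate each unordered pair (i < j) exactly once.
    i, j = np.triu_indices(len(speakers), k=1)
    # Keep only pairs whose speaker labels differ.
    different = speakers[i] != speakers[j]
    i, j = i[different], j[different]
    # Sort by descending cosine score and keep the top k.
    order = np.argsort(-scores[i, j], kind="stable")[:k]
    return [(int(a), int(b)) for a, b in zip(i[order], j[order])]


# Toy data (assumed): four utterances from three speakers.
ivectors = np.array([[1.0, 0.0], [0.9, 0.1], [0.8, 0.6], [0.0, 1.0]])
speakers = ["A", "A", "B", "C"]
pairs = hardest_different_speaker_pairs(ivectors, speakers, k=2)
print(pairs)
```

The selected pairs are the different-speaker utterances that plain cosine scoring finds hardest to tell apart, which is why the abstract describes the set as capturing the most discriminative between-speaker variability.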