Synthetic References for Template-based ASR using Posterior Features

Recently, the use of phoneme class-conditional probabilities as features (posterior features) for template-based ASR has been proposed. These features have been found to generalize well to unseen data and yield better systems than standard spectral-based features. In this paper, motivated by the hig...

Bibliographic Details
Published in:	Interspeech 2012
Main Authors:	Soldo, Serena; Magimai.-Doss, Mathew; Bourlard, Hervé
Format:	Text
Language:	unknown
Published:	2013
Subjects:	Arctic
Online Access:	https://doi.org/10.21437/Interspeech.2012-573 https://infoscience.epfl.ch/record/192642/files/Soldo_INTERSPEECH_2012.pdf http://infoscience.epfl.ch/record/192642

id	ftinfoscience:oai:infoscience.tind.io:192642
record_format	openpolar
institution	Open Polar
collection	EPFL Infoscience (Ecole Polytechnique Fédérale Lausanne)
op_collection_id	ftinfoscience
language	unknown
description	Recently, the use of phoneme class-conditional probabilities as features (posterior features) for template-based ASR has been proposed. These features have been found to generalize well to unseen data and yield better systems than standard spectral-based features. In this paper, motivated by the high quality of current text-to-speech systems and the robustness of posterior features toward undesired variability, we investigate the use of synthetic speech to generate reference templates. The use of synthetic speech in template-based ASR not only addresses the issue of in-domain data collection but also allows expansion of the vocabulary. Using 75- and 600-word task-independent and speaker-independent setups on the Phonebook database, we investigate different synthetic voices produced by the Festival HTS-based synthesizer trained on the CMU ARCTIC databases. Our study shows that synthetic speech templates can yield performance comparable to that of natural speech templates, especially with synthetic voices that have high intelligibility.
format	Text
author	Soldo, Serena; Magimai.-Doss, Mathew; Bourlard, Hervé
author_facet	Soldo, Serena; Magimai.-Doss, Mathew; Bourlard, Hervé
author_sort	Soldo, Serena
title	Synthetic References for Template-based ASR using Posterior Features
title_sort	synthetic references for template-based asr using posterior features
publishDate	2013
url	https://doi.org/10.21437/Interspeech.2012-573 https://infoscience.epfl.ch/record/192642/files/Soldo_INTERSPEECH_2012.pdf http://infoscience.epfl.ch/record/192642
geographic	Arctic
geographic_facet	Arctic
genre	Arctic
genre_facet	Arctic
op_source	http://infoscience.epfl.ch/record/192642
op_relation	doi:10.21437/Interspeech.2012-573 https://infoscience.epfl.ch/record/192642/files/Soldo_INTERSPEECH_2012.pdf http://infoscience.epfl.ch/record/192642
op_doi	https://doi.org/10.21437/Interspeech.2012-573
container_title	Interspeech 2012
container_start_page	2146
op_container_end_page	2149
_version_	1766335062461120512

Similar Items