A Probabilistic Approach to Unit Selection for Corpus-based Speech Synthesis
In this paper, we present a novel statistical approach to corpus-based speech synthesis. Unit selection is directed by probabilistic models for F0 contour, duration, and spectral characteristics of the synthesis units. The F0 targets for units are modeled by statistical additive models, and duration...
Main Authors: | Shinsuke Sakai, Han Shu |
---|---|
Other Authors: | The Pennsylvania State University CiteSeerX Archives |
Format: | Text |
Language: | English |
Subjects: | |
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.214.8712 http://www.festvox.org/blizzard/bc2005/IS052272.PDF |
id | ftciteseerx:oai:CiteSeerX.psu:10.1.1.214.8712 |
---|---|
record_format | openpolar |
spelling | ftciteseerx:oai:CiteSeerX.psu:10.1.1.214.8712 2023-05-15T15:00:07+02:00 A Probabilistic Approach to Unit Selection for Corpus-based Speech Synthesis Shinsuke Sakai Han Shu The Pennsylvania State University CiteSeerX Archives http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.214.8712 http://www.festvox.org/blizzard/bc2005/IS052272.PDF en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.214.8712 http://www.festvox.org/blizzard/bc2005/IS052272.PDF Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://www.festvox.org/blizzard/bc2005/IS052272.PDF text ftciteseerx 2016-01-07T17:59:55Z In this paper, we present a novel statistical approach to corpus-based speech synthesis. Unit selection is directed by probabilistic models for F0 contour, duration, and spectral characteristics of the synthesis units. The F0 targets for units are modeled by statistical additive models, and duration targets are modeled by regression trees. Spectral targets for a unit are modeled by Gaussian mixtures on MFCC-based features. Goodness of concatenation of two units is modeled by conditional Gaussian models on MFCC-based features. Although the system is in its early stage of development, we implemented an English speech synthesizer with CMU Arctic corpora and confirmed the effectiveness of this new framework. Text Arctic Unknown Arctic |
institution | Open Polar |
collection | Unknown |
op_collection_id | ftciteseerx |
language | English |
description | In this paper, we present a novel statistical approach to corpus-based speech synthesis. Unit selection is directed by probabilistic models for F0 contour, duration, and spectral characteristics of the synthesis units. The F0 targets for units are modeled by statistical additive models, and duration targets are modeled by regression trees. Spectral targets for a unit are modeled by Gaussian mixtures on MFCC-based features. Goodness of concatenation of two units is modeled by conditional Gaussian models on MFCC-based features. Although the system is in its early stage of development, we implemented an English speech synthesizer with CMU Arctic corpora and confirmed the effectiveness of this new framework. |
author2 | The Pennsylvania State University CiteSeerX Archives |
format | Text |
author | Shinsuke Sakai Han Shu |
spellingShingle | Shinsuke Sakai Han Shu A Probabilistic Approach to Unit Selection for Corpus-based Speech Synthesis |
author_facet | Shinsuke Sakai Han Shu |
author_sort | Shinsuke Sakai |
title | A Probabilistic Approach to Unit Selection for Corpus-based Speech Synthesis |
title_short | A Probabilistic Approach to Unit Selection for Corpus-based Speech Synthesis |
title_full | A Probabilistic Approach to Unit Selection for Corpus-based Speech Synthesis |
title_fullStr | A Probabilistic Approach to Unit Selection for Corpus-based Speech Synthesis |
title_full_unstemmed | A Probabilistic Approach to Unit Selection for Corpus-based Speech Synthesis |
title_sort | probabilistic approach to unit selection for corpus-based speech synthesis |
url | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.214.8712 http://www.festvox.org/blizzard/bc2005/IS052272.PDF |
geographic | Arctic |
geographic_facet | Arctic |
genre | Arctic |
genre_facet | Arctic |
op_source | http://www.festvox.org/blizzard/bc2005/IS052272.PDF |
op_relation | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.214.8712 http://www.festvox.org/blizzard/bc2005/IS052272.PDF |
op_rights | Metadata may be used without restrictions as long as the oai identifier remains attached to it. |
_version_ | 1766332231457964032 |
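
The abstract describes unit selection driven by probabilistic target models and a probabilistic concatenation model, but this record does not say how the best unit sequence is searched for. As a rough illustration only, the sketch below assumes a Viterbi search that maximizes the sum of target and concatenation log-probabilities. The function names (`select_units`, `target_logp`, `concat_logp`), the one-dimensional features, and the toy values are all hypothetical and are not taken from the paper; single Gaussians stand in for the additive models, regression trees, and Gaussian mixtures named in the abstract.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def select_units(candidates, target_logp, concat_logp):
    """Viterbi search for the unit sequence with the highest total log-probability.

    candidates: one list of candidate units per target position.
    target_logp(pos, unit): log-probability of `unit` under the target model at `pos`.
    concat_logp(prev_unit, unit): log-probability of joining the two units.
    """
    # best holds, for each candidate at the current position,
    # (score of the best path ending there, that path).
    best = [(target_logp(0, u), [u]) for u in candidates[0]]
    for pos in range(1, len(candidates)):
        new_best = []
        for u in candidates[pos]:
            # Pick the best predecessor path for this candidate.
            score, path = max(
                ((s + concat_logp(p[-1], u), p) for s, p in best),
                key=lambda sp: sp[0],
            )
            new_best.append((score + target_logp(pos, u), path + [u]))
        best = new_best
    return max(best, key=lambda sp: sp[0])


# Toy data (hypothetical): each unit is (name, f0, start_feature, end_feature).
candidates = [
    [("a1", 100.0, 0.0, 1.0), ("a2", 120.0, 0.0, 5.0)],
    [("b1", 110.0, 1.2, 2.0), ("b2", 110.0, 4.0, 2.0)],
]
f0_targets = [100.0, 110.0]  # assumed target F0 mean per position


def target_logp(pos, unit):
    # Stand-in target model: a single Gaussian on F0 only.
    return gaussian_logpdf(unit[1], f0_targets[pos], 100.0)


def concat_logp(prev_unit, unit):
    # Stand-in conditional Gaussian: the next unit's start feature
    # given the previous unit's end feature.
    return gaussian_logpdf(unit[2], prev_unit[3], 1.0)


score, path = select_units(candidates, target_logp, concat_logp)
print([u[0] for u in path], score)
```

Working in log-probabilities turns the product of model scores into a sum, so the search is a standard dynamic program whose cost grows with the number of candidate pairs at adjacent positions.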