Blizzard 2008: Experiments on Unit Size for Unit Selection Speech Synthesis
Format: | Text |
---|---|
Language: | English |
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.145.3050 http://www.cs.cmu.edu/~awb/papers/bc2008/IIIT-H_D.pdf |
Summary: | This paper describes the techniques and approaches developed at IIIT Hyderabad for building synthetic voices for the Blizzard 2008 speech synthesis challenge. We submitted three voices: an English full voice, an English ARCTIC voice, and a Mandarin voice. Our system is identified as D. In building the three voices, our approach has been to experiment with and exploit syllable-like large units for concatenative synthesis. In spite of the large database supplied in Blizzard 2008, we find that a back-off strategy is essential when using syllable-like units. In this paper, we propose a novel technique, approximate matching of syllables, as a back-off technique for building voices. Index Terms: speech synthesis, unit size, tonal unit, prominence |