Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration

This work aims to build a multilingual text-to-speech (TTS) synthesis system for ten lower-resourced Turkic languages: Azerbaijani, Bashkir, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Turkmen, Uyghur, and Uzbek. We specifically target the zero-shot learning scenario, where a TTS model trained using the...


Bibliographic Details
Main Authors: Yeshpanov, Rustem; Mussakhojayeva, Saida; Khassanov, Yerbolat
Format: Text
Language: unknown
Published: 2023
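Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing; Computer Science - Computation and Language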
Online Access: http://arxiv.org/abs/2305.15749
id ftarxivpreprints:oai:arXiv.org:2305.15749
record_format openpolar
spelling ftarxivpreprints:oai:arXiv.org:2305.15749 2023-09-05T13:22:51+02:00 Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration Yeshpanov, Rustem Mussakhojayeva, Saida Khassanov, Yerbolat 2023-05-25 http://arxiv.org/abs/2305.15749 unknown http://arxiv.org/abs/2305.15749 Electrical Engineering and Systems Science - Audio and Speech Processing Computer Science - Computation and Language text 2023 ftarxivpreprints 2023-08-16T17:43:32Z This work aims to build a multilingual text-to-speech (TTS) synthesis system for ten lower-resourced Turkic languages: Azerbaijani, Bashkir, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Turkmen, Uyghur, and Uzbek. We specifically target the zero-shot learning scenario, where a TTS model trained using the data of one language is applied to synthesise speech for other, unseen languages. An end-to-end TTS system based on the Tacotron 2 architecture was trained using only the available data of the Kazakh language. To generate speech for the other Turkic languages, we first mapped the letters of the Turkic alphabets onto the symbols of the International Phonetic Alphabet (IPA), which were then converted to the Kazakh alphabet letters. To demonstrate the feasibility of the proposed approach, we evaluated the multilingual Turkic TTS model subjectively and obtained promising results. To enable replication of the experiments, we make our code and dataset publicly available in our GitHub repository. Comment: 5 pages, 1 figure, 3 tables, accepted to Interspeech Text Sakha ArXiv.org (Cornell University Library) Sakha
institution Open Polar
collection ArXiv.org (Cornell University Library)
op_collection_id ftarxivpreprints
language unknown
topic Electrical Engineering and Systems Science - Audio and Speech Processing
Computer Science - Computation and Language
spellingShingle Electrical Engineering and Systems Science - Audio and Speech Processing
Computer Science - Computation and Language
Yeshpanov, Rustem
Mussakhojayeva, Saida
Khassanov, Yerbolat
Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration
topic_facet Electrical Engineering and Systems Science - Audio and Speech Processing
Computer Science - Computation and Language
description This work aims to build a multilingual text-to-speech (TTS) synthesis system for ten lower-resourced Turkic languages: Azerbaijani, Bashkir, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Turkmen, Uyghur, and Uzbek. We specifically target the zero-shot learning scenario, where a TTS model trained using the data of one language is applied to synthesise speech for other, unseen languages. An end-to-end TTS system based on the Tacotron 2 architecture was trained using only the available data of the Kazakh language. To generate speech for the other Turkic languages, we first mapped the letters of the Turkic alphabets onto the symbols of the International Phonetic Alphabet (IPA), which were then converted to the Kazakh alphabet letters. To demonstrate the feasibility of the proposed approach, we evaluated the multilingual Turkic TTS model subjectively and obtained promising results. To enable replication of the experiments, we make our code and dataset publicly available in our GitHub repository. Comment: 5 pages, 1 figure, 3 tables, accepted to Interspeech
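The two-stage transliteration described in the abstract (source-alphabet letters to IPA symbols, then IPA symbols to Kazakh letters) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `transliterate` and both lookup tables are hypothetical, the handful of Turkish entries are assumptions chosen for the example rather than the mapping published with the paper, and the input is assumed to be already lowercased.

```python
# Hypothetical, deliberately tiny mapping tables for illustration only.
# They are NOT the tables released by the authors.
TURKISH_TO_IPA = {
    "a": "ɑ", "b": "b", "l": "l", "m": "m",
    "r": "r", "s": "s", "ş": "ʃ", "t": "t",
}
IPA_TO_KAZAKH = {
    "ɑ": "а", "b": "б", "l": "л", "m": "м",
    "r": "р", "s": "с", "ʃ": "ш", "t": "т",
}


def transliterate(text, source_to_ipa, ipa_to_target):
    """Map each source letter to IPA, then map the IPA symbol to a target letter.

    Characters missing from the source table (spaces, punctuation, unmapped
    letters) are passed through unchanged. IPA symbols missing from the
    target table are kept as IPA so that gaps stay visible.
    """
    out = []
    for ch in text:
        ipa = source_to_ipa.get(ch)
        if ipa is None:
            out.append(ch)
        else:
            out.append(ipa_to_target.get(ipa, ipa))
    return "".join(out)


if __name__ == "__main__":
    # Turkish "taş" (stone) rendered in Kazakh Cyrillic letters.
    print(transliterate("taş", TURKISH_TO_IPA, IPA_TO_KAZAKH))  # таш
```

Routing every language through IPA means that each additional source alphabet needs only one new letter-to-IPA table, while the IPA-to-Kazakh table is shared; the resulting Kazakh-script text can then be passed to a TTS model trained on Kazakh data alone.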
format Text
author Yeshpanov, Rustem
Mussakhojayeva, Saida
Khassanov, Yerbolat
author_facet Yeshpanov, Rustem
Mussakhojayeva, Saida
Khassanov, Yerbolat
author_sort Yeshpanov, Rustem
title Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration
title_short Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration
title_full Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration
title_fullStr Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration
title_full_unstemmed Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration
title_sort multilingual text-to-speech synthesis for turkic languages using transliteration
publishDate 2023
url http://arxiv.org/abs/2305.15749
geographic Sakha
geographic_facet Sakha
genre Sakha
genre_facet Sakha
op_relation http://arxiv.org/abs/2305.15749
_version_ 1776203412085407744