Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer

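This paper describes a unified Deep Metric Learning (DML) framework that predicts the target cost directly through supervised learning. Conventional methods calculate the target cost in two separate steps: feature extraction and standard distance measurement. The proposed DML framework aims to measure the similarity between candidate units and target units more reasonably and directly. First, a symmetrical DML framework is pre-trained to learn the metric between pairs of candidate units and target units, and a relabeling procedure is added to correct the initially designed labels of the target cost. Second, the acoustic features of the target units are removed, which matches the runtime conditions of the unit-selection synthesizer, and the asymmetrical DML is fine-tuned to learn the metric between candidate units and target units. Compared with the conventional methods, the proposed unified DML framework avoids the accumulation of errors across separate steps and improves the accuracy of labeling and predicting the target cost. The evaluation results demonstrate that adopting the DML framework to predict the target cost improves the naturalness of synthetic speech.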
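To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows the conventional two-step target cost (hand-designed feature extraction followed by a standard Euclidean distance) next to a measure in which both units pass through one shared mapping. The unit fields `duration_ms` and `f0_hz`, the scaling constants, and every function name are hypothetical, and the fixed linear `project` merely stands in for a network whose weights would be learned from labeled target costs.

```python
import math


def euclidean(a, b):
    """Standard Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def extract_features(unit):
    """Step 1 (hypothetical): hand-designed features from a unit description."""
    return [unit["duration_ms"] / 100.0, unit["f0_hz"] / 100.0]


def conventional_target_cost(candidate, target):
    """Step 2: a fixed, standard distance applied to the extracted features."""
    return euclidean(extract_features(candidate), extract_features(target))


def project(vec, weights):
    """Linear map standing in for a trained network (weights assumed given)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def learned_target_cost(candidate, target, weights):
    """Both units pass through the same mapping, so the measure is symmetric."""
    cand = project(extract_features(candidate), weights)
    targ = project(extract_features(target), weights)
    return euclidean(cand, targ)


# Illustrative values only; they are not taken from the paper.
candidate = {"duration_ms": 120.0, "f0_hz": 210.0}
target = {"duration_ms": 90.0, "f0_hz": 250.0}
identity = [[1.0, 0.0], [0.0, 1.0]]

print(conventional_target_cost(candidate, target))
print(learned_target_cost(candidate, target, identity))
```

With identity weights the shared-mapping cost reduces to the conventional one; training would adjust the weights so that the resulting distance tracks the labeled target cost instead of a fixed standard distance.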

Bibliographic Details
Main Authors:	Fu, Ruibo; Tao, Jianhua; Zheng, Yibin; Wen, Zhengqi
Format:	Other/Unknown Material
Language:	English
Published:	2018
Subjects:	speech synthesis; unit-selection; target cost; deep metric learning; DML
Online Access:	http://ir.ia.ac.cn/handle/173211/39597

id	ftchiacadsccasia:oai:ir.ia.ac.cn:173211/39597
record_format	openpolar
spelling	ftchiacadsccasia:oai:ir.ia.ac.cn:173211/39597 2024-06-23T07:52:22+00:00 Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer Fu, Ruibo Tao, Jianhua Zheng, Yibin Wen, Zhengqi 2018-09 http://ir.ia.ac.cn/handle/173211/39597 English eng http://ir.ia.ac.cn/handle/173211/39597 speech synthesis unit-selection target cost deep metric learning Conference paper 2018 ftchiacadsccasia 2024-06-04T00:01:37Z This paper describes a unified Deep Metric Learning (DML) framework that predicts the target cost directly through supervised learning. Conventional methods calculate the target cost in two separate steps: feature extraction and standard distance measurement. The proposed DML framework aims to measure the similarity between candidate units and target units more reasonably and directly. First, a symmetrical DML framework is pre-trained to learn the metric between pairs of candidate units and target units, and a relabeling procedure is added to correct the initially designed labels of the target cost. Second, the acoustic features of the target units are removed, which matches the runtime conditions of the unit-selection synthesizer, and the asymmetrical DML is fine-tuned to learn the metric between candidate units and target units. Compared with the conventional methods, the proposed unified DML framework avoids the accumulation of errors across separate steps and improves the accuracy of labeling and predicting the target cost. The evaluation results demonstrate that adopting the DML framework to predict the target cost improves the naturalness of synthetic speech. Other/Unknown Material DML Institute of Automation: CASIA OpenIR (Chinese Academy of Sciences)
institution	Open Polar
collection	Institute of Automation: CASIA OpenIR (Chinese Academy of Sciences)
op_collection_id	ftchiacadsccasia
language	English
topic	speech synthesis unit-selection target cost deep metric learning
spellingShingle	speech synthesis unit-selection target cost deep metric learning Fu, Ruibo Tao, Jianhua Zheng, Yibin Wen, Zhengqi Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer
topic_facet	speech synthesis unit-selection target cost deep metric learning
description	This paper describes a unified Deep Metric Learning (DML) framework that predicts the target cost directly through supervised learning. Conventional methods calculate the target cost in two separate steps: feature extraction and standard distance measurement. The proposed DML framework aims to measure the similarity between candidate units and target units more reasonably and directly. First, a symmetrical DML framework is pre-trained to learn the metric between pairs of candidate units and target units, and a relabeling procedure is added to correct the initially designed labels of the target cost. Second, the acoustic features of the target units are removed, which matches the runtime conditions of the unit-selection synthesizer, and the asymmetrical DML is fine-tuned to learn the metric between candidate units and target units. Compared with the conventional methods, the proposed unified DML framework avoids the accumulation of errors across separate steps and improves the accuracy of labeling and predicting the target cost. The evaluation results demonstrate that adopting the DML framework to predict the target cost improves the naturalness of synthetic speech.
format	Other/Unknown Material
author	Fu, Ruibo Tao, Jianhua Zheng, Yibin Wen, Zhengqi
author_facet	Fu, Ruibo Tao, Jianhua Zheng, Yibin Wen, Zhengqi
author_sort	Fu, Ruibo
title	Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer
title_short	Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer
title_full	Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer
title_fullStr	Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer
title_full_unstemmed	Deep Metric Learning for the Target Cost in Unit-Selection Speech Synthesizer
title_sort	deep metric learning for the target cost in unit-selection speech synthesizer
publishDate	2018
url	http://ir.ia.ac.cn/handle/173211/39597
genre	DML
genre_facet	DML
op_relation	http://ir.ia.ac.cn/handle/173211/39597
_version_	1802643653080907776