A Replication Dataset for Fundamental Frequency Estimation
Part of the dissertation Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods. © 2020, Bastian Bechtold. All rights reserved. Estimating the fundamental frequency of speech remains an active area of research, with varied applications in speech...
Main Author: | Bechtold, Bastian |
---|---|
Format: | Dataset |
Language: | English |
Published: | 2020 |
Subjects: | signal processing audio speech pitch fundamental frequency |
Online Access: | https://zenodo.org/record/3904389 https://doi.org/10.5281/zenodo.3904389 |
id |
ftzenodo:oai:zenodo.org:3904389 |
---|---|
record_format |
openpolar |
spelling |
ftzenodo:oai:zenodo.org:3904389 2023-05-15T15:14:18+02:00 A Replication Dataset for Fundamental Frequency Estimation Bechtold, Bastian 2020-06-26 https://zenodo.org/record/3904389 https://doi.org/10.5281/zenodo.3904389 eng eng doi:10.5281/zenodo.3904388 https://zenodo.org/record/3904389 https://doi.org/10.5281/zenodo.3904389 oai:zenodo.org:3904389 info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by/4.0/legalcode signal processing audio speech pitch fundamental frequency info:eu-repo/semantics/other dataset 2020 ftzenodo https://doi.org/10.5281/zenodo.3904389 10.5281/zenodo.3904388 2023-03-11T03:31:00Z Part of the dissertation Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods. © 2020, Bastian Bechtold. All rights reserved. Estimating the fundamental frequency of speech remains an active area of research, with varied applications in speech recognition, speaker identification, and speech compression. A vast number of algorithms for estimating this quantity have been proposed over the years, and a number of speech and noise corpora have been developed for evaluating their performance. The present dataset contains estimated fundamental frequency tracks of 25 algorithms, six speech corpora, two noise corpora, at nine signal-to-noise ratios between -20 and 20 dB SNR, as well as an additional evaluation of synthetic harmonic tone complexes in white noise. The dataset also contains pre-calculated performance measures both novel and traditional, in reference to each speech corpus’ ground truth, the algorithms’ own clean-speech estimate, and our own consensus truth. It can thus serve as the basis for a comparison study, or to replicate existing studies from a larger dataset, or as a reference for developing new fundamental frequency estimation algorithms. All source code and data are available to download, and entirely reproducible, albeit requiring about one year of processor-time.
Included Code and Data: ground truth data.zip is a JBOF dataset of fundamental frequency estimates and ground truths of all speech files in the following corpora: CMU-ARCTIC (consensus truth) [1], FDA (corpus truth and consensus truth) [2], KEELE (corpus truth and consensus truth) [3], MOCHA-TIMIT (consensus truth) [4], PTDB-TUG (corpus truth and consensus truth) [5], TIMIT (consensus truth) [6]. noisy speech data.zip is a JBOF dataset of fundamental frequency estimates of speech files mixed with noise from the following corpora: NOISEX [7], QUT-NOISE [8]. synthetic speech data.zip is a JBOF dataset of fundamental frequency estimates of synthetic harmonic tone complexes in white noise. ... Dataset Arctic Zenodo Arctic |
institution |
Open Polar |
collection |
Zenodo |
op_collection_id |
ftzenodo |
language |
English |
topic |
signal processing audio speech pitch fundamental frequency |
spellingShingle |
signal processing audio speech pitch fundamental frequency Bechtold, Bastian A Replication Dataset for Fundamental Frequency Estimation |
topic_facet |
signal processing audio speech pitch fundamental frequency |
description |
Part of the dissertation Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods. © 2020, Bastian Bechtold. All rights reserved. Estimating the fundamental frequency of speech remains an active area of research, with varied applications in speech recognition, speaker identification, and speech compression. A vast number of algorithms for estimating this quantity have been proposed over the years, and a number of speech and noise corpora have been developed for evaluating their performance. The present dataset contains estimated fundamental frequency tracks of 25 algorithms, six speech corpora, two noise corpora, at nine signal-to-noise ratios between -20 and 20 dB SNR, as well as an additional evaluation of synthetic harmonic tone complexes in white noise. The dataset also contains pre-calculated performance measures both novel and traditional, in reference to each speech corpus’ ground truth, the algorithms’ own clean-speech estimate, and our own consensus truth. It can thus serve as the basis for a comparison study, or to replicate existing studies from a larger dataset, or as a reference for developing new fundamental frequency estimation algorithms. All source code and data are available to download, and entirely reproducible, albeit requiring about one year of processor-time.
Included Code and Data: ground truth data.zip is a JBOF dataset of fundamental frequency estimates and ground truths of all speech files in the following corpora: CMU-ARCTIC (consensus truth) [1], FDA (corpus truth and consensus truth) [2], KEELE (corpus truth and consensus truth) [3], MOCHA-TIMIT (consensus truth) [4], PTDB-TUG (corpus truth and consensus truth) [5], TIMIT (consensus truth) [6]. noisy speech data.zip is a JBOF dataset of fundamental frequency estimates of speech files mixed with noise from the following corpora: NOISEX [7], QUT-NOISE [8]. synthetic speech data.zip is a JBOF dataset of fundamental frequency estimates of synthetic harmonic tone complexes in white noise. ... |
format |
Dataset |
author |
Bechtold, Bastian |
author_facet |
Bechtold, Bastian |
author_sort |
Bechtold, Bastian |
title |
A Replication Dataset for Fundamental Frequency Estimation |
title_short |
A Replication Dataset for Fundamental Frequency Estimation |
title_full |
A Replication Dataset for Fundamental Frequency Estimation |
title_fullStr |
A Replication Dataset for Fundamental Frequency Estimation |
title_full_unstemmed |
A Replication Dataset for Fundamental Frequency Estimation |
title_sort |
replication dataset for fundamental frequency estimation |
publishDate |
2020 |
url |
https://zenodo.org/record/3904389 https://doi.org/10.5281/zenodo.3904389 |
geographic |
Arctic |
geographic_facet |
Arctic |
genre |
Arctic |
genre_facet |
Arctic |
op_relation |
doi:10.5281/zenodo.3904388 https://zenodo.org/record/3904389 https://doi.org/10.5281/zenodo.3904389 oai:zenodo.org:3904389 |
op_rights |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by/4.0/legalcode |
op_doi |
https://doi.org/10.5281/zenodo.3904389 10.5281/zenodo.3904388 |
_version_ |
1766344769204649984 |
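
The description in this record mentions speech files mixed with noise at nine signal-to-noise ratios between -20 and 20 dB. As a rough illustration of what mixing at a target SNR means, the following Python sketch scales a noise signal so that the speech-to-noise power ratio matches a requested value. It is not the dataset's own code: the function names `mix_at_snr` and `measured_snr_db`, the even 5 dB spacing of the nine levels, the 16 kHz sample rate, and the sine tone standing in for speech are all assumptions made for this example.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech + scaled noise with a speech-to-noise power ratio of snr_db."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    # Gain that brings the noise to the power implied by the target SNR.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """Recover the SNR of a mixture, given the clean speech it contains."""
    residual = [m - s for s, m in zip(speech, mixture)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(r * r for r in residual) / len(residual)
    return 10 * math.log10(speech_power / noise_power)


# Assumption: nine evenly spaced levels, 5 dB apart, spanning -20 to 20 dB.
SNRS_DB = [-20 + 5 * i for i in range(9)]

# Hypothetical stand-ins for real signals: one second at an assumed 16 kHz rate.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 120 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

for snr in SNRS_DB:
    mixture = mix_at_snr(speech, noise, snr)
    print(snr, round(measured_snr_db(speech, mixture), 6))
```

Each printed pair shows the requested level next to the level measured from the mixture, which should agree up to floating-point rounding. Real corpora would additionally require choosing a noise segment and deciding whether power is measured over the whole file or only over active speech, which this sketch does not address.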