A Replication Dataset for Fundamental Frequency Estimation

Part of the dissertation Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods. © 2020, Bastian Bechtold. All rights reserved. Estimating the fundamental frequency of speech remains an active area of research, with varied applications in speech...


Bibliographic Details
Main Author: Bechtold, Bastian
Format: Dataset
Language: English
Published: 2020
Subjects: signal processing; audio; speech; pitch; fundamental frequency
Online Access: https://zenodo.org/record/3904389
https://doi.org/10.5281/zenodo.3904389
id ftzenodo:oai:zenodo.org:3904389
record_format openpolar
spelling ftzenodo:oai:zenodo.org:3904389 2023-05-15T15:14:18+02:00 A Replication Dataset for Fundamental Frequency Estimation Bechtold, Bastian 2020-06-26 https://zenodo.org/record/3904389 https://doi.org/10.5281/zenodo.3904389 eng eng doi:10.5281/zenodo.3904388 https://zenodo.org/record/3904389 https://doi.org/10.5281/zenodo.3904389 oai:zenodo.org:3904389 info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by/4.0/legalcode signal processing audio speech pitch fundamental frequency info:eu-repo/semantics/other dataset 2020 ftzenodo https://doi.org/10.5281/zenodo.3904389 https://doi.org/10.5281/zenodo.3904388 2023-03-11T03:31:00Z Dataset Arctic Zenodo Arctic
institution Open Polar
collection Zenodo
op_collection_id ftzenodo
language English
topic signal processing
audio
speech
pitch
fundamental frequency
spellingShingle signal processing
audio
speech
pitch
fundamental frequency
Bechtold, Bastian
A Replication Dataset for Fundamental Frequency Estimation
topic_facet signal processing
audio
speech
pitch
fundamental frequency
description Part of the dissertation Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods. © 2020, Bastian Bechtold. All rights reserved. Estimating the fundamental frequency of speech remains an active area of research, with varied applications in speech recognition, speaker identification, and speech compression. A vast number of algorithms for estimating this quantity have been proposed over the years, and a number of speech and noise corpora have been developed for evaluating their performance. The present dataset contains fundamental frequency tracks estimated by 25 algorithms on six speech corpora, mixed with noise from two noise corpora at nine signal-to-noise ratios (SNRs) between -20 and 20 dB, as well as an additional evaluation of synthetic harmonic tone complexes in white noise. The dataset also contains pre-calculated performance measures, both novel and traditional, computed in reference to each speech corpus’ ground truth, the algorithms’ own clean-speech estimate, and our own consensus truth. It can thus serve as the basis for a comparison study, for replicating existing studies on a larger dataset, or as a reference for developing new fundamental frequency estimation algorithms. All source code and data are available for download, and the results are entirely reproducible, albeit at a cost of about one year of processor time.
Included Code and Data
ground truth data.zip is a JBOF dataset of fundamental frequency estimates and ground truths of all speech files in the following corpora:
CMU-ARCTIC (consensus truth) [1]
FDA (corpus truth and consensus truth) [2]
KEELE (corpus truth and consensus truth) [3]
MOCHA-TIMIT (consensus truth) [4]
PTDB-TUG (corpus truth and consensus truth) [5]
TIMIT (consensus truth) [6]
noisy speech data.zip is a JBOF dataset of fundamental frequency estimates of speech files mixed with noise from the following corpora:
NOISEX [7]
QUT-NOISE [8]
synthetic speech data.zip is a JBOF dataset of fundamental frequency estimates of synthetic harmonic tone complexes in white noise. ...
format Dataset
author Bechtold, Bastian
author_facet Bechtold, Bastian
author_sort Bechtold, Bastian
title A Replication Dataset for Fundamental Frequency Estimation
title_short A Replication Dataset for Fundamental Frequency Estimation
title_full A Replication Dataset for Fundamental Frequency Estimation
title_fullStr A Replication Dataset for Fundamental Frequency Estimation
title_full_unstemmed A Replication Dataset for Fundamental Frequency Estimation
title_sort replication dataset for fundamental frequency estimation
publishDate 2020
url https://zenodo.org/record/3904389
https://doi.org/10.5281/zenodo.3904389
geographic Arctic
geographic_facet Arctic
genre Arctic
genre_facet Arctic
op_relation doi:10.5281/zenodo.3904388
https://zenodo.org/record/3904389
https://doi.org/10.5281/zenodo.3904389
oai:zenodo.org:3904389
op_rights info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by/4.0/legalcode
op_doi https://doi.org/10.5281/zenodo.3904389
https://doi.org/10.5281/zenodo.3904388
_version_ 1766344769204649984
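
Note on the description: the dataset evaluates speech mixed with noise at signal-to-noise ratios between -20 and 20 dB. The record does not say how the mixtures were produced, so the following is only a minimal sketch of one common way to mix a signal with noise at a target SNR, not the author's method. The function names (mix_at_snr, measured_snr_db), the 150 Hz tone complex standing in for speech, the Gaussian white noise, and the three example SNR values are all assumptions made for illustration.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("signals must have nonzero power")
    # Required noise power for the requested SNR: P_noise = P_speech / 10^(SNR/10)
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """SNR in dB of a mixture, given the clean signal it contains."""
    residual = [m - s for s, m in zip(speech, mixture)]
    speech_power = sum(s * s for s in speech)
    noise_power = sum(r * r for r in residual)
    return 10 * math.log10(speech_power / noise_power)


# Hypothetical stand-ins (not taken from the dataset):
# one second of a 150 Hz harmonic tone complex and Gaussian white noise.
sample_rate = 16000
rng = random.Random(0)
speech = [
    sum(math.sin(2 * math.pi * 150 * h * t / sample_rate) / h for h in range(1, 6))
    for t in range(sample_rate)
]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

# Three example values inside the range stated in the description.
for snr_db in (-20, 0, 20):
    mixture = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, mixture), 6))
```

Scaling the noise rather than the speech keeps the level of the clean signal identical across all SNR conditions, so estimates at different SNRs can be compared against the same clean reference.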