Speech and Noise Corpora for Pitch Estimation of Human Speech

Part of the dissertation "Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods". © 2020, Bastian Bechtold. All rights reserved. This dataset contains common speech and noise corpora for evaluating fundamental frequency estimation algorithms, packaged as convenient JBOF dataframes. Each corpus is freely available on its own and permits redistribution: CMU-ARCTIC...


Bibliographic Details
Main Author: Bastian Bechtold
Other Authors: van de Par, Steven, Bitzer, Joerg
Format: Other/Unknown Material
Language: English
Published: Zenodo 2020
Subjects: speech, noise, fundamental frequency estimation
Online Access: https://doi.org/10.5281/zenodo.3921794
id ftzenodo:oai:zenodo.org:3921794
record_format openpolar
spelling ftzenodo:oai:zenodo.org:3921794 2024-09-09T19:24:28+00:00 Speech and Noise Corpora for Pitch Estimation of Human Speech Bastian Bechtold van de Par, Steven Bitzer, Joerg 2020-06-29 https://doi.org/10.5281/zenodo.3921794 eng eng Zenodo https://doi.org/10.5281/zenodo.3920590 https://doi.org/10.5281/zenodo.3921794 oai:zenodo.org:3921794 info:eu-repo/semantics/openAccess Other (Non-Commercial) speech noise fundamental frequency estimation info:eu-repo/semantics/other 2020 ftzenodo https://doi.org/10.5281/zenodo.3921794 10.5281/zenodo.3920590 2024-07-26T14:21:26Z Other/Unknown Material Arctic Zenodo Arctic Mervyn ENVELOPE(65.307,65.307,-70.509,-70.509)
institution Open Polar
collection Zenodo
op_collection_id ftzenodo
language English
topic speech
noise
fundamental frequency estimation
spellingShingle speech
noise
fundamental frequency estimation
Bastian Bechtold
Speech and Noise Corpora for Pitch Estimation of Human Speech
topic_facet speech
noise
fundamental frequency estimation
description Part of the dissertation "Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods". © 2020, Bastian Bechtold. All rights reserved.
This dataset contains common speech and noise corpora for evaluating fundamental frequency estimation algorithms, packaged as convenient JBOF dataframes. Each corpus is freely available on its own and permits redistribution:
CMU-ARCTIC (BSD license) [1]
FDA (free to download) [2]
KEELE (free for noncommercial use) [3]
MOCHA-TIMIT (free for noncommercial use) [4]
PTDB-TUG (ODBL license) [5]
NOISEX (free to download) [7]
QUT-NOISE (CC-BY-SA license) [8]
Additionally, this dataset contains PDAs-0.0.1-py3-none-any.whl, a Python ≥ 3.6 module for Linux that contains several well-known fundamental frequency estimation algorithms:
AUTOC [9], AMDF [10], BANA [11], CEP [12], CREPE [13], DIO [14], DNN [15], KALDI [16], MAPS, MBSC [17], NLS [18], PEFAC [19], PRAAT [20], RAPT [21], SACC [22], SAFE [23], SHR [24], SIFT [25], SRH [26], STRAIGHT [27], SWIPE [28], YAAPT [29], YIN [30]
The algorithms are included in their native programming language and adapted to a common Python interface:
Matlab: BANA, DNN, MBSC, NLS, NLS2, PEFAC, RAPT, RNN, SACC, SHR, SRH, STRAIGHT, SWIPE, YAAPT, YIN
C: KALDI, PRAAT, SAFE
Python: AMDF, AUTOC, CEP, CREPE, MAPS, SIFT
AMDF, AUTOC, CEP, and SIFT are our partial re-implementations, as no original source code could be found. All algorithms have been released as open source software and are covered by their respective licenses.
All of these files are published as part of my dissertation, "Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods", and in support of the Replication Dataset for Fundamental Frequency Estimation.
References:
[1] John Kominek and Alan W Black. CMU ARCTIC database for speech synthesis, 2003.
[2] Paul C Bagshaw, Steven Hiller, and Mervyn A Jack. Enhanced Pitch Tracking and the Processing of F0 Contours for Computer Aided Intonation Teaching. In EUROSPEECH, 1993.
[3] F Plante, Georg F Meyer, and William A Ainsworth. A Pitch Extraction Reference Database. ...
author2 van de Par, Steven
Bitzer, Joerg
format Other/Unknown Material
author Bastian Bechtold
author_facet Bastian Bechtold
author_sort Bastian Bechtold
title Speech and Noise Corpora for Pitch Estimation of Human Speech
title_short Speech and Noise Corpora for Pitch Estimation of Human Speech
title_full Speech and Noise Corpora for Pitch Estimation of Human Speech
title_fullStr Speech and Noise Corpora for Pitch Estimation of Human Speech
title_full_unstemmed Speech and Noise Corpora for Pitch Estimation of Human Speech
title_sort speech and noise corpora for pitch estimation of human speech
publisher Zenodo
publishDate 2020
url https://doi.org/10.5281/zenodo.3921794
long_lat ENVELOPE(65.307,65.307,-70.509,-70.509)
geographic Arctic
Mervyn
geographic_facet Arctic
Mervyn
genre Arctic
genre_facet Arctic
op_relation https://doi.org/10.5281/zenodo.3920590
https://doi.org/10.5281/zenodo.3921794
oai:zenodo.org:3921794
op_rights info:eu-repo/semantics/openAccess
Other (Non-Commercial)
op_doi https://doi.org/10.5281/zenodo.3921794
10.5281/zenodo.3920590
_version_ 1809894354837307392