CMU ARCTIC Concatenated 15s

Bibliographic Details
Main Author: Robin Scheibler
Format: Moving Image (Video)
Language: English
Published: 2019
Subjects:
Online Access: https://zenodo.org/record/3066489
https://doi.org/10.5281/zenodo.3066489
Description
Summary: CMU ARCTIC Concat15

This dataset contains 140 speech samples formed by concatenating utterances from the CMU ARCTIC speech corpus [1]. The speakers are gender balanced, with 7 female and 7 male speakers and 10 samples per speaker. Each sample was formed by concatenating utterances from the CMU ARCTIC corpus until its length exceeded 15 seconds.

The following speakers were selected:
female: axb, clb, eey, ljm, lnh, slp, slt
male: aew, ahw, aup, awb, bdl, fem, gka

Use the dataset

A JSON file containing the metadata is provided. The file is structured as follows.

    {
      "fs": <sampling rate>,
      "files": [
        "first_file.wav",
      ],
      "sorted": {
        "male": {
          "aew": [
            "cmu_arctic_male_aew_1.wav",
            <rest of this speaker's files>
          ],
          <rest of the male speakers>
        },
        "female": {
        }
      }
    }

Two functions are provided to help select files, sampling and wav_read_center. The first selects a number of distinct subsets of speakers.

    sampling(num_subsets, num_speakers, metadata_file, gender_balanced=False, seed=None):

    This function automatically picks random subsets of speech samples from the list of files described in the metadata file.

    Parameters
    ---
    num_subsets: int
        Number of subsets to create
    num_speakers: int
        Number of distinct speakers desired in a subset
    metadata_file: str
        Location of the metadata file
    gender_balanced: bool, optional
        If True, the subsets will have the same number of male and female speakers when `num_speakers` is even, and one extra male speaker when `num_speakers` is odd. Default is `False`.
    seed: int, optional
        When a seed is provided, the random number generator is set to a deterministic state. This is useful for consistently getting the same set of speakers. The initial state of the random number generator is restored at the end of the function. When not provided, the random number generator is used without setting the seed.

    Returns
    ---
    A list of `num_subsets` lists of wav filenames, each containing `num_speakers` entries.
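The metadata layout and the sampling behaviour described above can be illustrated with a short Python sketch. This is not the code shipped with the dataset. The names load_metadata, build_toy_metadata and sample_subsets are hypothetical, the female file-name pattern is assumed to mirror the male one, and the sampling rate of 16000 is only a placeholder. The sketch also uses a local random generator instead of saving and restoring the global state, which is a simplification of the documented behaviour.

```python
import json
import random

MALE = ["aew", "ahw", "aup", "awb", "bdl", "fem", "gka"]
FEMALE = ["axb", "clb", "eey", "ljm", "lnh", "slp", "slt"]


def load_metadata(metadata_file):
    """Read a JSON metadata file with the layout described above."""
    with open(metadata_file, "r") as f:
        return json.load(f)


def build_toy_metadata(fs=16000, files_per_speaker=10):
    """Build an in-memory stand-in for the metadata (assumed file names)."""
    sorted_files = {
        gender: {
            spk: [f"cmu_arctic_{gender}_{spk}_{i}.wav"
                  for i in range(1, files_per_speaker + 1)]
            for spk in spks
        }
        for gender, spks in (("male", MALE), ("female", FEMALE))
    }
    files = [f for spks in sorted_files.values()
             for lst in spks.values() for f in lst]
    return {"fs": fs, "files": files, "sorted": sorted_files}


def sample_subsets(metadata, num_subsets, num_speakers,
                   gender_balanced=False, seed=None):
    """Pick num_subsets lists of files, one file per distinct speaker."""
    rng = random.Random(seed)  # local generator, global state untouched
    by_gender = metadata["sorted"]
    # Flatten to a single speaker -> files mapping
    speakers = {spk: files
                for gender in ("male", "female")
                for spk, files in by_gender[gender].items()}
    subsets = []
    for _ in range(num_subsets):
        if gender_balanced:
            n_female = num_speakers // 2
            n_male = num_speakers - n_female  # one extra male when odd
            chosen = (rng.sample(sorted(by_gender["male"]), n_male)
                      + rng.sample(sorted(by_gender["female"]), n_female))
        else:
            chosen = rng.sample(sorted(speakers), num_speakers)
        # One randomly chosen file for each selected speaker
        subsets.append([rng.choice(speakers[spk]) for spk in chosen])
    return subsets


if __name__ == "__main__":
    meta = build_toy_metadata()
    for subset in sample_subsets(meta, 2, 3, gender_balanced=True, seed=0):
        print(subset)
```

With a real copy of the dataset, load_metadata would be pointed at the provided JSON file in place of build_toy_metadata. Passing the same seed twice yields the same subsets, which matches the purpose of the seed parameter described above.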
The second reads a set of wav files and adjusts their length so that ...