Learning Stage-wise GANs for Whistle Extraction in Time-Frequency Spectrograms
Whistle contour extraction aims to derive animal whistles from time-frequency spectrograms as polylines. For toothed whales, whistle extraction results can serve as the basis for analyzing animal abundance, species identity, and social activities. During the last few decades, as long-term recording...
Published in: | IEEE Transactions on Multimedia |
---|---|
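Main Authors: | Li, Pu; Roch, Marie; Klinck, Holger; Fleishman, Erica; Gillespie, Douglas; Nosal, Eva-Marie; Shiu, Yu; Liu, Xiaobai |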
Format: | Text |
Language: | unknown |
Published: | 2023 |
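Subjects: | Computer Science - Computer Vision and Pattern Recognition; Electrical Engineering and Systems Science - Signal Processing |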
Online Access: | http://arxiv.org/abs/2304.02714 https://doi.org/10.1109/TMM.2023.3251109 |
id |
ftarxivpreprints:oai:arXiv.org:2304.02714 |
---|---|
record_format |
openpolar |
spelling |
ftarxivpreprints:oai:arXiv.org:2304.02714 2023-09-05T13:23:44+02:00 Learning Stage-wise GANs for Whistle Extraction in Time-Frequency Spectrograms Li, Pu Roch, Marie Klinck, Holger Fleishman, Erica Gillespie, Douglas Nosal, Eva-Marie Shiu, Yu Liu, Xiaobai 2023-04-05 http://arxiv.org/abs/2304.02714 https://doi.org/10.1109/TMM.2023.3251109 unknown http://arxiv.org/abs/2304.02714 doi:10.1109/TMM.2023.3251109 Computer Science - Computer Vision and Pattern Recognition Electrical Engineering and Systems Science - Signal Processing text 2023 ftarxivpreprints https://doi.org/10.1109/TMM.2023.3251109 2023-08-16T17:37:53Z Whistle contour extraction aims to derive animal whistles from time-frequency spectrograms as polylines. For toothed whales, whistle extraction results can serve as the basis for analyzing animal abundance, species identity, and social activities. During the last few decades, as long-term recording systems have become affordable, automated whistle extraction algorithms were proposed to process large volumes of recording data. Recently, a deep learning-based method demonstrated superior performance in extracting whistles under varying noise conditions. However, training such networks requires a large amount of labor-intensive annotation, which is not available for many species. To overcome this limitation, we present a framework of stage-wise generative adversarial networks (GANs), which compile new whistle data suitable for deep model training via three stages: generation of background noise in the spectrogram, generation of whistle contours, and generation of whistle signals. By separating the generation of different components in the samples, our framework composes visually promising whistle data and labels even when few expert annotated data are available. 
Regardless of the amount of human-annotated data, the proposed data augmentation framework leads to a consistent improvement in the performance of the whistle extraction model, with a maximum increase of 1.69 in the whistle extraction mean F1-score. Our stage-wise GAN also surpasses a single GAN in improving whistle extraction models with augmented data. The data and code will be available at https://github.com/Paul-LiPu/CompositeGAN_WhistleAugment. Comment: Accepted by IEEE Transactions on Multimedia (2023) Text toothed whales ArXiv.org (Cornell University Library) IEEE Transactions on Multimedia 1 13 |
institution |
Open Polar |
collection |
ArXiv.org (Cornell University Library) |
op_collection_id |
ftarxivpreprints |
language |
unknown |
topic |
Computer Science - Computer Vision and Pattern Recognition Electrical Engineering and Systems Science - Signal Processing |
spellingShingle |
Computer Science - Computer Vision and Pattern Recognition Electrical Engineering and Systems Science - Signal Processing Li, Pu Roch, Marie Klinck, Holger Fleishman, Erica Gillespie, Douglas Nosal, Eva-Marie Shiu, Yu Liu, Xiaobai Learning Stage-wise GANs for Whistle Extraction in Time-Frequency Spectrograms |
topic_facet |
Computer Science - Computer Vision and Pattern Recognition Electrical Engineering and Systems Science - Signal Processing |
description |
Whistle contour extraction aims to derive animal whistles from time-frequency spectrograms as polylines. For toothed whales, whistle extraction results can serve as the basis for analyzing animal abundance, species identity, and social activities. During the last few decades, as long-term recording systems have become affordable, automated whistle extraction algorithms were proposed to process large volumes of recording data. Recently, a deep learning-based method demonstrated superior performance in extracting whistles under varying noise conditions. However, training such networks requires a large amount of labor-intensive annotation, which is not available for many species. To overcome this limitation, we present a framework of stage-wise generative adversarial networks (GANs), which compile new whistle data suitable for deep model training via three stages: generation of background noise in the spectrogram, generation of whistle contours, and generation of whistle signals. By separating the generation of different components in the samples, our framework composes visually promising whistle data and labels even when few expert annotated data are available. Regardless of the amount of human-annotated data, the proposed data augmentation framework leads to a consistent improvement in the performance of the whistle extraction model, with a maximum increase of 1.69 in the whistle extraction mean F1-score. Our stage-wise GAN also surpasses a single GAN in improving whistle extraction models with augmented data. The data and code will be available at https://github.com/Paul-LiPu/CompositeGAN_WhistleAugment. Comment: Accepted by IEEE Transactions on Multimedia (2023) |
format |
Text |
author |
Li, Pu Roch, Marie Klinck, Holger Fleishman, Erica Gillespie, Douglas Nosal, Eva-Marie Shiu, Yu Liu, Xiaobai |
author_facet |
Li, Pu Roch, Marie Klinck, Holger Fleishman, Erica Gillespie, Douglas Nosal, Eva-Marie Shiu, Yu Liu, Xiaobai |
author_sort |
Li, Pu |
title |
Learning Stage-wise GANs for Whistle Extraction in Time-Frequency Spectrograms |
title_short |
Learning Stage-wise GANs for Whistle Extraction in Time-Frequency Spectrograms |
title_full |
Learning Stage-wise GANs for Whistle Extraction in Time-Frequency Spectrograms |
title_fullStr |
Learning Stage-wise GANs for Whistle Extraction in Time-Frequency Spectrograms |
title_full_unstemmed |
Learning Stage-wise GANs for Whistle Extraction in Time-Frequency Spectrograms |
title_sort |
learning stage-wise gans for whistle extraction in time-frequency spectrograms |
publishDate |
2023 |
url |
http://arxiv.org/abs/2304.02714 https://doi.org/10.1109/TMM.2023.3251109 |
genre |
toothed whales |
genre_facet |
toothed whales |
op_relation |
http://arxiv.org/abs/2304.02714 doi:10.1109/TMM.2023.3251109 |
op_doi |
https://doi.org/10.1109/TMM.2023.3251109 |
container_title |
IEEE Transactions on Multimedia |
container_start_page |
1 |
op_container_end_page |
13 |
_version_ |
1776204330670489600 |