A machine learning pipeline for classification of cetacean echolocation clicks in large underwater acoustic datasets.

Machine learning algorithms, including recent advances in deep learning, are promising tools for detection and classification of broadband high-frequency signals in passive acoustic recordings. However, these methods are generally data-hungry, and progress has been limited by challenges related t...


Bibliographic Details
Published in: PLOS Computational Biology
Main Author: Kaitlin E Frasier
Format: Article in Journal/Newspaper
Language: English
Published: Public Library of Science (PLoS) 2021
Subjects:
Online Access: https://doi.org/10.1371/journal.pcbi.1009613
https://doaj.org/article/6e65d5a3c6304928a57495375a8dd080
id ftdoajarticles:oai:doaj.org/article:6e65d5a3c6304928a57495375a8dd080
record_format openpolar
institution Open Polar
collection Directory of Open Access Journals: DOAJ Articles
op_collection_id ftdoajarticles
language English
topic Biology (General)
QH301-705.5
spellingShingle Biology (General)
QH301-705.5
Kaitlin E Frasier
A machine learning pipeline for classification of cetacean echolocation clicks in large underwater acoustic datasets.
topic_facet Biology (General)
QH301-705.5
description Machine learning algorithms, including recent advances in deep learning, are promising tools for detection and classification of broadband high-frequency signals in passive acoustic recordings. However, these methods are generally data-hungry, and progress has been limited by challenges related to the lack of labeled datasets adequate for training and testing. Large quantities of known and as-yet-unidentified broadband signal types mingle in marine recordings, with variability introduced by acoustic propagation, source depths and orientations, and interacting signals. Manual classification of these datasets is unmanageable without in-depth knowledge of the acoustic context of each recording location. A signal classification pipeline is presented that combines unsupervised and supervised learning phases with opportunities for expert oversight to label signals of interest. The method is illustrated with a case study using unsupervised clustering to identify five toothed whale echolocation click types and two anthropogenic signal categories. These categories are used to train a deep network to classify detected signals, either in averaged time bins or as individual detections, in two independent datasets. Bin-level classification achieved higher overall precision (>99%) than click-level classification. However, click-level classification had the advantage of providing a label for every signal, and achieved higher overall recall, with overall precision from 92 to 94%. The results suggest that unsupervised learning is a viable solution for efficiently generating the large, representative training sets needed for applications of deep learning in passive acoustics.
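The two-phase idea in the description (cluster unlabeled detections, let an expert name the clusters, then classify new detections and score them by precision and recall) can be illustrated with a toy sketch. This is not the article's method or code: a plain 1-D k-means and a nearest-centroid rule stand in for the clustering step and the deep network, and the peak-frequency values, the starting centroids, and the names `type_A` and `type_B` are all hypothetical.

```python
from statistics import mean


def nearest_index(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


def kmeans_1d(values, centroids, iterations=20):
    """Unsupervised phase (toy stand-in): plain 1-D k-means."""
    centroids = list(centroids)
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            groups[nearest_index(v, centroids)].append(v)
        # Keep the previous centroid if a cluster receives no points.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def classify(value, centroids, names):
    """Supervised phase (toy stand-in): nearest-centroid label."""
    return names[nearest_index(value, centroids)]


def precision_recall(truth, predicted, target):
    """Precision and recall for one class from paired label lists."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical peak frequencies (kHz) of unlabeled detections.
unlabeled = [24.0, 25.5, 26.0, 23.5, 38.0, 39.5, 40.0, 41.0]
centroids = kmeans_1d(unlabeled, centroids=[20.0, 45.0])

# Expert oversight step: a person assigns a name to each cluster.
names = ["type_A", "type_B"]

# Hypothetical held-out detections with expert-verified labels.
test_values = [25.0, 27.0, 33.0, 39.0, 31.0]
test_truth = ["type_A", "type_A", "type_B", "type_B", "type_B"]
test_pred = [classify(v, centroids, names) for v in test_values]

for name in names:
    p, r = precision_recall(test_truth, test_pred, name)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In this toy data one detection near the cluster boundary is mislabeled, so one class loses precision while the other loses recall, which is the kind of trade-off the precision and recall figures in the description summarize.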
format Article in Journal/Newspaper
author Kaitlin E Frasier
title A machine learning pipeline for classification of cetacean echolocation clicks in large underwater acoustic datasets.
title_sort machine learning pipeline for classification of cetacean echolocation clicks in large underwater acoustic datasets.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doi.org/10.1371/journal.pcbi.1009613
https://doaj.org/article/6e65d5a3c6304928a57495375a8dd080
genre toothed whale
genre_facet toothed whale
op_source PLoS Computational Biology, Vol 17, Iss 12, p e1009613 (2021)
op_relation https://doi.org/10.1371/journal.pcbi.1009613
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
1553-734X
1553-7358
doi:10.1371/journal.pcbi.1009613
https://doaj.org/article/6e65d5a3c6304928a57495375a8dd080
op_doi https://doi.org/10.1371/journal.pcbi.1009613
container_title PLOS Computational Biology
container_volume 17
container_issue 12
container_start_page e1009613
_version_ 1766218029780172800