Image_3_A Convolutional Neural Network for Automated Detection of Humpback Whale Song in a Diverse, Long-Term Passive Acoustic Dataset.JPEG

Bibliographic Details
Main Authors: Ann N. Allen, Matt Harvey, Lauren Harrell, Aren Jansen, Karlina P. Merkens, Carrie C. Wall, Julie Cattiau, Erin M. Oleson
Format: Still Image
Language: unknown
Published: 2021
Subjects:
Online Access: https://doi.org/10.3389/fmars.2021.607321.s004
Description
Summary: Passive acoustic monitoring is a well-established tool for researching the occurrence, movements, and ecology of a wide variety of marine mammal species. Advances in hardware and data collection have exponentially increased the volumes of passive acoustic data collected, such that discoveries are now limited by the time required to analyze rather than collect the data. To address this limitation, we trained a deep convolutional neural network (CNN) to identify humpback whale song in over 187,000 h of acoustic data collected at 13 different monitoring sites in the North Pacific over a 14-year period. The model successfully detected 75 s audio segments containing humpback song with an average precision of 0.97 and an average area under the receiver operating characteristic curve (AUC-ROC) of 0.992. The model output was used to analyze spatial and temporal patterns of humpback song, corroborating known seasonal patterns in the Hawaiian and Mariana Islands, including occurrence at remote monitoring sites beyond well-studied aggregations, as well as the novel discovery of humpback whale song at Kingman Reef, at 5° North latitude. This study demonstrates the ability of a CNN trained on a small dataset to generalize well to a highly variable signal type across a diverse range of recording and noise conditions. We demonstrate the utility of active learning approaches for creating high-quality models in specialized domains where annotations are rare. These results validate the feasibility of applying deep learning models to identify highly variable signals across broad spatial and temporal scales, enabling new discoveries through combining large datasets with cutting-edge tools.
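
The summary reports segment-level evaluation using average precision and AUC-ROC. The sketch below is only an illustration of how such per-segment metrics can be computed with scikit-learn from binary labels (whether a 75 s segment contains song) and model scores; the labels and scores are synthetic placeholders, not outputs or data from the study.

# Illustrative sketch only: computing the segment-level metrics named in the
# summary (average precision and AUC-ROC) from hypothetical model outputs.
# The labels and scores below are randomly generated placeholders.
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical ground truth: 1 = segment contains humpback song, 0 = no song.
y_true = rng.integers(0, 2, size=1000)

# Hypothetical model scores in [0, 1]; higher means more likely to contain song.
y_score = np.clip(y_true * 0.7 + rng.normal(0.3, 0.25, size=1000), 0.0, 1.0)

print("average precision:", average_precision_score(y_true, y_score))
print("AUC-ROC:", roc_auc_score(y_true, y_score))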