From citizen science to AI models: Advancing cetacean vocalization automatic detection through multi-annotator campaigns


Bibliographic Details
Published in: Ecological Informatics
Main Authors: Dubus, Gabriel, Cazau, Dorian, Torterotot, Maëlle, Gros-Martial, Anatole, Nguyen Hong Duc, Paul, Adam, Olivier
Format: Article in Journal/Newspaper
Language: English
Published: Elsevier BV 2024
Online Access: https://archimer.ifremer.fr/doc/00889/100094/110389.pdf
https://archimer.ifremer.fr/doc/00889/100094/110390.jpg
https://archimer.ifremer.fr/doc/00889/100094/110391.jpg
https://doi.org/10.1016/j.ecoinf.2024.102642
https://archimer.ifremer.fr/doc/00889/100094/
Description
Summary: Continuous underwater Passive Acoustic Monitoring (PAM) has emerged as a powerful tool for cetacean research. To handle the vast volume of collected data, it is essential to employ automated detection and classification methods. The recent advancement of deep learning, involving model training and testing, requires a large amount of labeled data. These labels are derived through the manual annotation of audio files, a task often reliant on human experts. Based on an annotation campaign focusing on blue whale calls in the Indian Ocean involving 19 novice annotators and an expert in bioacoustics, this study explores the integration of novice annotators in marine bioacoustics research through citizen science programs, which could drastically increase the size of labeled datasets and enhance the performance of detection and classification models. The analysis reveals distinctive annotation profiles influenced by the complexity of vocalizations and the annotators' strategies, ranging from conservative to permissive. To address the challenges of annotation discrepancies, Convolutional Neural Networks (CNNs) are trained on annotations from both novices and the expert. The results show variations in model performance. Our work highlights the importance of annotation guidelines encouraging a more conservative approach to improve overall annotation quality. In an effort to optimize the potential of multi-annotation and mitigate the presence of noisy labels, two annotation aggregation methods (majority voting and soft labeling) are proposed and tested.
The results demonstrate that both methods, particularly when a sufficient number of annotators is involved, significantly improve model performance and reduce variability: with 13 aggregated annotators, the standard deviation of the areas under the PR and ROC curves falls below 0.02 for both vocalization types, whereas it was 0.17 and 0.21 for the blue whale D-calls and 0.05 and 0.04 for the SEIO PBW vocalizations when annotators were considered separately. Moreover, these aggregation methods enable the ...
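The two aggregation methods named in the abstract, majority voting and soft labeling, can be illustrated with a minimal sketch. This is not the authors' implementation; it assumes binary per-segment annotations (call present/absent) stored as one row per annotator, with hypothetical example data:

```python
import numpy as np

def majority_vote(annotations):
    """Hard-label aggregation: a segment is labeled positive when at
    least half of the annotators marked a call in it."""
    annotations = np.asarray(annotations)  # shape (n_annotators, n_segments)
    return (annotations.mean(axis=0) >= 0.5).astype(int)

def soft_labels(annotations):
    """Soft-label aggregation: the training target for each segment is
    the fraction of annotators who marked a call, keeping annotator
    disagreement as label uncertainty."""
    annotations = np.asarray(annotations)
    return annotations.mean(axis=0)

# Hypothetical annotations from 3 annotators over 4 audio segments
ann = [[1, 0, 1, 0],
       [1, 1, 1, 0],
       [0, 0, 1, 0]]

print(majority_vote(ann))  # [1 0 1 0]
print(soft_labels(ann))    # fractions of agreement, e.g. 2/3 for segment 1
```

Soft labels can be fed directly to a CNN trained with a cross-entropy loss, whereas majority voting yields conventional hard targets; the abstract reports that both reduce variance across models once enough annotators are aggregated.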