Automated whistle extraction for precise scaled annotations

Whistles produced by odontocete species can be used as indicators for species, density or individual identification, as well as for communication studies. In spectrogram representations, these vocalisations vary across a wide range of time-frequency shapes. Their annotation is a challenge that is often time-consuming and labour-intensive.


Bibliographic Details
Main Authors: Lehnhoff, Loïc, Mérigot, Bastien, Glotin, Hervé
Other Authors: MARine Biodiversity Exploitation and Conservation - MARBEC (UMR MARBEC), Institut de Recherche pour le Développement (IRD)-Institut Français de Recherche pour l'Exploitation de la Mer (IFREMER)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Laboratoire d'Informatique et des Systèmes (LIS) (Marseille, Toulon) (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), DYNamiques de l’Information (DYNI), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), University of Toulon, CIAN, The DOLPHINFREE project is funded by the European Maritime and Fisheries Fund (EMFF) and France Filière Pêche (FFP). L.L.’s PhD grant is provided by Montpellier University., ANR-20-CHIA-0014,ADSIL,Écoute intelligente sous-marine avancée(2020), ANR-21-CE04-0020,ULP-COCHLEA,Cochlée 3D, Intelligente et ultra basse consommation(2021)
Format: Conference Object
Language: English
Published: HAL CCSD 2024
Subjects:
Online Access:https://hal.science/hal-04650230
https://hal.science/hal-04650230v2/document
https://hal.science/hal-04650230v2/file/poster_ECS35_LL-2.pdf
id ftunivtoulon:oai:HAL:hal-04650230v2
record_format openpolar
institution Open Polar
collection Université de Toulon: HAL
op_collection_id ftunivtoulon
language English
topic Contour extraction
Artificial intelligence
Dolphins
[SPI.ACOU]Engineering Sciences [physics]/Acoustics [physics.class-ph]
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
[SDV]Life Sciences [q-bio]
description Whistles produced by odontocete species can be used as indicators for species, density or individual identification, as well as for communication studies. In spectrogram representations, these vocalisations vary across a wide range of time-frequency shapes. Their annotation is a challenge that is often time-consuming and labour-intensive. Existing automated contour-extraction solutions are improving, but still struggle to be accurate when vocalisations overlap, which is common when studying groups of free-ranging small cetaceans. To address this problem, we developed a two-step method in Python for detecting and extracting whistles within an audio recording. The method requires a relatively small number of manual annotations (< 2500) and was tested on recordings of free-ranging common dolphins (Delphinus delphis) in the Northwest Atlantic. For whistle detection, we selected the YOLOv8 model (a popular state-of-the-art object detection model) to predict bounding boxes around whistles in spectrograms. YOLOv8 is designed to detect complex objects in landscape images; it therefore performs well on comparatively simple spectrogram images, even in noisy conditions. The model was trained to detect and classify two categories: isolated whistles and overlapping whistles. Bounding boxes containing overlaps are passed to the user for manual contour extraction using a custom-made annotation tool, while bounding boxes containing isolated whistles are fed into a deep-learning regression model that predicts the contour from the isolated image of each whistle. The performance of the regression depends heavily on the quality of the manual annotations, but it can generalise from them to predict any whistle shape. Overall, the method achieves a satisfactory compromise between annotation speed and prediction accuracy: simple whistles are extracted automatically, and only the most complex cases (i.e. overlapping whistles) are handled by the user.
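The routing step of the two-step pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the detector output is mocked rather than produced by a trained YOLOv8 model, and the class labels (`"isolated"`, `"overlapping"`), the `Detection` type, and the function names are assumptions for the sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    """One whistle bounding box predicted on a spectrogram image.

    The two class names used below are assumed from the abstract,
    not the authors' actual label strings.
    """
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
    label: str                      # "isolated" or "overlapping"

def route_detections(detections: List[Detection]):
    """Split detector output: isolated whistles go to the automatic
    contour-regression model, overlapping whistles to manual annotation."""
    auto_queue = [d for d in detections if d.label == "isolated"]
    manual_queue = [d for d in detections if d.label == "overlapping"]
    return auto_queue, manual_queue

def crop_box(spectrogram, box):
    """Crop one detection from a 2-D spectrogram (rows = frequency bins,
    columns = time frames) before feeding it to the contour regressor."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in spectrogram[y_min:y_max]]

# Mocked detector output for a single spectrogram; a real pipeline would
# obtain these boxes and labels from the trained detection model.
detections = [
    Detection((10, 40, 120, 90), "isolated"),
    Detection((100, 30, 210, 95), "overlapping"),
    Detection((300, 50, 380, 80), "isolated"),
]
auto_queue, manual_queue = route_detections(detections)
```

The design point this captures is the compromise the abstract describes: the cheap automatic path handles the common, simple cases, and human effort is reserved for the overlapping whistles that automated contour extraction handles poorly.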
author2 MARine Biodiversity Exploitation and Conservation - MARBEC (UMR MARBEC)
Institut de Recherche pour le Développement (IRD)-Institut Français de Recherche pour l'Exploitation de la Mer (IFREMER)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)
Laboratoire d'Informatique et des Systèmes (LIS) (Marseille, Toulon) (LIS)
Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
DYNamiques de l’Information (DYNI)
Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)-Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS)
University of Toulon, CIAN
The DOLPHINFREE project is funded by the European Maritime and Fisheries Fund (EMFF) and France Filière Pêche (FFP). L.L.’s PhD grant is provided by Montpellier University.
ANR-20-CHIA-0014,ADSIL,Écoute intelligente sous-marine avancée(2020)
ANR-21-CE04-0020,ULP-COCHLEA,Cochlée 3D, Intelligente et ultra basse consommation(2021)
format Conference Object
author Lehnhoff, Loïc
Mérigot, Bastien
Glotin, Hervé
author_sort Lehnhoff, Loïc
title Automated whistle extraction for precise scaled annotations
publisher HAL CCSD
publishDate 2024
url https://hal.science/hal-04650230
https://hal.science/hal-04650230v2/document
https://hal.science/hal-04650230v2/file/poster_ECS35_LL-2.pdf
op_coverage Catania, Italy
genre Northwest Atlantic
genre_facet Northwest Atlantic
op_source 35th European Cetacean Society Conference, Apr 2024, Catania, Italy
https://hal.science/hal-04650230
op_relation hal-04650230
op_rights http://creativecommons.org/licenses/by-nc/
info:eu-repo/semantics/OpenAccess
_version_ 1809931069046128640