Multimodal deep learning for cetacean distribution modeling of fin whales (Balaenoptera physalus) in the western Mediterranean Sea

Bibliographic Details
Published in: Machine Learning
Main Authors: Cazau, Dorian, Nguyen Hong Duc, P., Druon, J.-N., Matwins, S., Fablet, Ronan
Other Authors: Equipe Observations Signal & Environnement (Lab-STICC_OSE), Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance (Lab-STICC), École Nationale d'Ingénieurs de Brest (ENIB)-Université de Bretagne Sud (UBS)-Université de Brest (UBO)-École Nationale Supérieure de Techniques Avancées Bretagne (ENSTA Bretagne)-Institut Mines-Télécom Paris (IMT)-Centre National de la Recherche Scientifique (CNRS)-Université Bretagne Loire (UBL)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT)-École Nationale d'Ingénieurs de Brest (ENIB)-Université de Bretagne Sud (UBS)-Université de Brest (UBO)-École Nationale Supérieure de Techniques Avancées Bretagne (ENSTA Bretagne)-Institut Mines-Télécom Paris (IMT)-Centre National de la Recherche Scientifique (CNRS)-Université Bretagne Loire (UBL)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT), École Nationale Supérieure de Techniques Avancées Bretagne (ENSTA Bretagne), Institut Jean Le Rond d'Alembert (DALEMBERT), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), European Commission - Joint Research Centre Ispra (JRC), Dalhousie University Halifax, Département Mathematical and Electrical Engineering (IMT Atlantique - MEE), IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT), Interdisciplinary Graduate School for the Blue planet, ANR-17-EURE-0015,ISBlue,Interdisciplinary Graduate School for the Blue planet(2017)
Format: Article in Journal/Newspaper
Language: English
Published: HAL CCSD 2021
Subjects:
Gam
Online Access: https://hal.univ-brest.fr/hal-03324461
https://doi.org/10.1007/s10994-021-06029-z
Description
Summary: Cetacean Distribution Modeling (CDM) is used to quantify the distributions and densities of mobile marine species. It is essential to better understand and protect whales and their relatives. Current CDM approaches often fail to capture general species-environment relationships that remain valid across the broader range of environmental conditions characterizing the surveyed regions. This paper investigates the usefulness of deep-learning-based schemes, namely multi-task and transfer learning, in CDM. A stochastic presence-background model on a classification task was co-trained with a deterministic rule-based model on a regression task. Whale presence-only records were used for the first task, and index outputs of a feeding habitat occurrence model for the second. The approach was tested on the case study of fin whales in the western Mediterranean Sea. To evaluate it, a new metric called True Positive rate per unit of Surface Habitat (TPSH) and an original multimodal fully connected neural network were developed. A Generalized Additive Model (GAM), a standard CDM method, was also used as a performance reference. Results show that, on the TPSH metric, the multi-task learning model improves on the feeding habitat model by 10.8% and on data-driven models such as GAM by 16.5% in relative terms, indicating higher accuracy in estimating whale presence. These trends were further supported by two other independent datasets that forced the models to generalize beyond the species-environment relationships of their training dataset. Performance could be further improved by adopting better thresholds, as observed from Receiver Operating Characteristic curves; for example, the multi-task learning model could reach absolute gains of up to 10% in the median True Positive Rate while maintaining its habitat spatial spreading.
Globally, our work confirmed our working hypothesis that expert information on whale ...
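
The co-training described in the summary pairs a classification task (whale presence versus background) with a regression task (the feeding habitat index). A minimal sketch of how such a joint objective might be computed is given below. It is not the authors' implementation: the choice of binary cross-entropy and mean squared error, the weighting parameter `alpha`, the function names, and the toy values are all assumptions made for illustration.

```python
import numpy as np


def binary_cross_entropy(prob, label, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    prob = np.clip(prob, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(label * np.log(prob) + (1.0 - label) * np.log(1.0 - prob))))


def mean_squared_error(pred, target):
    """Mean squared error between predicted and reference index values."""
    return float(np.mean((pred - target) ** 2))


def multitask_loss(presence_prob, presence_label, habitat_pred, habitat_index, alpha=0.5):
    """Weighted sum of the two task losses.

    alpha is a hypothetical trade-off weight, not a value from the paper:
    alpha=1 keeps only the classification task, alpha=0 only the regression task.
    """
    cls_loss = binary_cross_entropy(presence_prob, presence_label)
    reg_loss = mean_squared_error(habitat_pred, habitat_index)
    return alpha * cls_loss + (1.0 - alpha) * reg_loss


# Toy values (illustrative only): four grid cells.
presence_prob = np.array([0.9, 0.2, 0.7, 0.1])   # predicted presence probability
presence_label = np.array([1.0, 0.0, 1.0, 0.0])  # 1 = presence record, 0 = background
habitat_pred = np.array([0.8, 0.3, 0.6, 0.2])    # predicted habitat index
habitat_index = np.array([1.0, 0.0, 0.5, 0.0])   # index from the rule-based habitat model

loss = multitask_loss(presence_prob, presence_label, habitat_pred, habitat_index)
print(f"joint loss: {loss:.4f}")
```

In a real network both outputs would come from shared layers, so minimizing the joint loss lets the habitat-index task shape the representation used for presence prediction; the weighting between the tasks would need tuning.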