Endless Forams: >34,000 Modern Planktonic Foraminiferal Images for Taxonomic Training and Automated Species Recognition Using Convolutional Neural Networks

Planktonic foraminiferal species identification is central to many paleoceanographic studies, from selecting species for geochemical research to elucidating the biotic dynamics of microfossil communities relevant to physical oceanographic processes and interconnected phenomena...

Bibliographic Details
Published in: Paleoceanography and Paleoclimatology
Main Authors: Hsiang, Allison, Brombacher, Anieke, Rillo, Marina, Mleneck‐vautravers, Maryline, Conn, Stephen, Lordsmith, Sian, Jentzen, Anna, Henehan, Michael, Metcalfe, Brett, Fenton, Isabel, Wade, Bridget, Fox, Lyndsey, Meilland, Julie, Davis, Catherine, Baranowski, Ulrike, Groeneveld, Jeroen, Edgar, Kirsty, Movellan, Aurore, Aze, Tracy, Dowsett, Harry, Miller, C. Giles, Rios, Nelson, Hull, Pincelli
Other Authors: Swedish Museum of Natural History (NRM), National Oceanography Centre Southampton (NOC), University of Southampton, Laboratoire des Sciences du Climat et de l'Environnement [Gif-sur-Yvette] (LSCE), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Institut national des sciences de l'Univers (INSU - CNRS)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Vrije Universiteit Amsterdam [Amsterdam] (VU)
Format: Article in Journal/Newspaper
Language: English
Published: HAL CCSD 2019
Online Access: https://hal.science/hal-02975093
https://hal.science/hal-02975093/document
https://hal.science/hal-02975093/file/2019PA003612.pdf
https://doi.org/10.1029/2019PA003612
id ftuniparissaclay:oai:HAL:hal-02975093v1
record_format openpolar
institution Open Polar
collection Archives ouvertes de Paris-Saclay
op_collection_id ftuniparissaclay
language English
topic [SDU.OCEAN]Sciences of the Universe [physics]/Ocean, Atmosphere
[SDU.ENVI]Sciences of the Universe [physics]/Continental interfaces, environment
spellingShingle [SDU.OCEAN]Sciences of the Universe [physics]/Ocean, Atmosphere
[SDU.ENVI]Sciences of the Universe [physics]/Continental interfaces, environment
Hsiang, Allison
Brombacher, Anieke
Rillo, Marina
Mleneck‐vautravers, Maryline
Conn, Stephen
Lordsmith, Sian
Jentzen, Anna
Henehan, Michael
Metcalfe, Brett
Fenton, Isabel
Wade, Bridget
Fox, Lyndsey
Meilland, Julie
Davis, Catherine
Baranowski, Ulrike
Groeneveld, Jeroen
Edgar, Kirsty
Movellan, Aurore
Aze, Tracy
Dowsett, Harry
Miller, C. Giles
Rios, Nelson
Hull, Pincelli
Endless Forams: >34,000 Modern Planktonic Foraminiferal Images for Taxonomic Training and Automated Species Recognition Using Convolutional Neural Networks
topic_facet [SDU.OCEAN]Sciences of the Universe [physics]/Ocean, Atmosphere
[SDU.ENVI]Sciences of the Universe [physics]/Continental interfaces, environment
description Planktonic foraminiferal species identification is central to many paleoceanographic studies, from selecting species for geochemical research to elucidating the biotic dynamics of microfossil communities relevant to physical oceanographic processes and interconnected phenomena such as climate change. However, few resources exist to train students in the difficult task of discerning amongst closely related species, resulting in diverging taxonomic schools that differ in species concepts and boundaries. This problem is exacerbated by the limited number of taxonomic experts. Here we document our initial progress toward removing these confounding and/or rate-limiting factors by generating the first extensive image library of modern planktonic foraminifera, providing digital taxonomic training tools and resources, and automating species-level taxonomic identification of planktonic foraminifera via machine learning using convolutional neural networks. Experts identified 34,640 images of modern (extant) planktonic foraminifera to the species level. These images are served as species exemplars through the online portal Endless Forams (endlessforams.org) and a taxonomic training portal hosted on the citizen science platform Zooniverse (zooniverse.org/projects/ahsiang/endless-forams/). A supervised machine learning classifier was then trained with ~27,000 images of these identified planktonic foraminifera. The best-performing model provided the correct species name for an image in the validation set 87.4% of the time and included the correct name in its top three guesses 97.7% of the time. Together, these resources provide a rigorous set of training tools in modern planktonic foraminiferal taxonomy and a means of rapidly generating assemblage data via machine learning in future studies for applications such as paleotemperature reconstruction.
author2 Swedish Museum of Natural History (NRM)
National Oceanography Centre Southampton (NOC)
University of Southampton
Laboratoire des Sciences du Climat et de l'Environnement [Gif-sur-Yvette] (LSCE)
Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Institut national des sciences de l'Univers (INSU - CNRS)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Direction de Recherche Fondamentale (CEA) (DRF (CEA))
Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)
Vrije Universiteit Amsterdam [Amsterdam] (VU)
format Article in Journal/Newspaper
author Hsiang, Allison
Brombacher, Anieke
Rillo, Marina
Mleneck‐vautravers, Maryline
Conn, Stephen
Lordsmith, Sian
Jentzen, Anna
Henehan, Michael
Metcalfe, Brett
Fenton, Isabel
Wade, Bridget
Fox, Lyndsey
Meilland, Julie
Davis, Catherine
Baranowski, Ulrike
Groeneveld, Jeroen
Edgar, Kirsty
Movellan, Aurore
Aze, Tracy
Dowsett, Harry
Miller, C. Giles
Rios, Nelson
Hull, Pincelli
author_facet Hsiang, Allison
Brombacher, Anieke
Rillo, Marina
Mleneck‐vautravers, Maryline
Conn, Stephen
Lordsmith, Sian
Jentzen, Anna
Henehan, Michael
Metcalfe, Brett
Fenton, Isabel
Wade, Bridget
Fox, Lyndsey
Meilland, Julie
Davis, Catherine
Baranowski, Ulrike
Groeneveld, Jeroen
Edgar, Kirsty
Movellan, Aurore
Aze, Tracy
Dowsett, Harry
Miller, C. Giles
Rios, Nelson
Hull, Pincelli
author_sort Hsiang, Allison
title Endless Forams: >34,000 Modern Planktonic Foraminiferal Images for Taxonomic Training and Automated Species Recognition Using Convolutional Neural Networks
title_short Endless Forams: >34,000 Modern Planktonic Foraminiferal Images for Taxonomic Training and Automated Species Recognition Using Convolutional Neural Networks
title_full Endless Forams: >34,000 Modern Planktonic Foraminiferal Images for Taxonomic Training and Automated Species Recognition Using Convolutional Neural Networks
title_fullStr Endless Forams: >34,000 Modern Planktonic Foraminiferal Images for Taxonomic Training and Automated Species Recognition Using Convolutional Neural Networks
title_full_unstemmed Endless Forams: >34,000 Modern Planktonic Foraminiferal Images for Taxonomic Training and Automated Species Recognition Using Convolutional Neural Networks
title_sort endless forams: >34,000 modern planktonic foraminiferal images for taxonomic training and automated species recognition using convolutional neural networks
publisher HAL CCSD
publishDate 2019
url https://hal.science/hal-02975093
https://hal.science/hal-02975093/document
https://hal.science/hal-02975093/file/2019PA003612.pdf
https://doi.org/10.1029/2019PA003612
genre Planktonic foraminifera
genre_facet Planktonic foraminifera
op_source ISSN: 2572-4525
EISSN: 1944-9186
Paleoceanography and Paleoclimatology
https://hal.science/hal-02975093
Paleoceanography and Paleoclimatology, 2019, 34 (7), pp.1157-1177. ⟨10.1029/2019PA003612⟩
op_relation info:eu-repo/semantics/altIdentifier/doi/10.1029/2019PA003612
hal-02975093
https://hal.science/hal-02975093
https://hal.science/hal-02975093/document
https://hal.science/hal-02975093/file/2019PA003612.pdf
doi:10.1029/2019PA003612
op_rights info:eu-repo/semantics/OpenAccess
op_doi https://doi.org/10.1029/2019PA003612
container_title Paleoceanography and Paleoclimatology
container_volume 34
container_issue 7
container_start_page 1157
op_container_end_page 1177
_version_ 1812180587388076032
spelling ftuniparissaclay:oai:HAL:hal-02975093v1 2024-10-06T13:52:15+00:00 Endless Forams: >34,000 Modern Planktonic Foraminiferal Images for Taxonomic Training and Automated Species Recognition Using Convolutional Neural Networks Hsiang, Allison Brombacher, Anieke Rillo, Marina Mleneck‐vautravers, Maryline Conn, Stephen Lordsmith, Sian Jentzen, Anna Henehan, Michael Metcalfe, Brett Fenton, Isabel Wade, Bridget Fox, Lyndsey Meilland, Julie Davis, Catherine Baranowski, Ulrike Groeneveld, Jeroen Edgar, Kirsty Movellan, Aurore Aze, Tracy Dowsett, Harry Miller, C. Giles Rios, Nelson Hull, Pincelli Swedish Museum of Natural History (NRM) National Oceanography Centre Southampton (NOC) University of Southampton Laboratoire des Sciences du Climat et de l'Environnement Gif-sur-Yvette (LSCE) Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Institut national des sciences de l'Univers (INSU - CNRS)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Direction de Recherche Fondamentale (CEA) (DRF (CEA)) Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA) Vrije Universiteit Amsterdam Amsterdam (VU) 2019-07-30 https://hal.science/hal-02975093 https://hal.science/hal-02975093/document https://hal.science/hal-02975093/file/2019PA003612.pdf https://doi.org/10.1029/2019PA003612 en eng HAL CCSD American Geophysical Union info:eu-repo/semantics/altIdentifier/doi/10.1029/2019PA003612 hal-02975093 https://hal.science/hal-02975093 https://hal.science/hal-02975093/document https://hal.science/hal-02975093/file/2019PA003612.pdf doi:10.1029/2019PA003612 info:eu-repo/semantics/OpenAccess ISSN: 2572-4525 EISSN: 1944-9186 Paleoceanography and Paleoclimatology https://hal.science/hal-02975093 Paleoceanography and Paleoclimatology, 2019, 34 (7), pp.1157-1177. 
⟨10.1029/2019PA003612⟩ [SDU.OCEAN]Sciences of the Universe [physics]/Ocean Atmosphere [SDU.ENVI]Sciences of the Universe [physics]/Continental interfaces environment info:eu-repo/semantics/article Journal articles 2019 ftuniparissaclay https://doi.org/10.1029/2019PA003612 2024-09-06T00:30:31Z Planktonic foraminiferal species identification is central to many paleoceanographic studies, from selecting species for geochemical research to elucidating the biotic dynamics of microfossil communities relevant to physical oceanographic processes and interconnected phenomena such as climate change. However, few resources exist to train students in the difficult task of discerning amongst closely related species, resulting in diverging taxonomic schools that differ in species concepts and boundaries. This problem is exacerbated by the limited number of taxonomic experts. Here we document our initial progress toward removing these confounding and/or rate-limiting factors by generating the first extensive image library of modern planktonic foraminifera, providing digital taxonomic training tools and resources, and automating species-level taxonomic identification of planktonic foraminifera via machine learning using convolutional neural networks. Experts identified 34,640 images of modern (extant) planktonic foraminifera to the species level. These images are served as species exemplars through the online portal Endless Forams (endlessforams.org) and a taxonomic training portal hosted on the citizen science platform Zooniverse (zooniverse.org/projects/ahsiang/endless-forams/). A supervised machine learning classifier was then trained with ~27,000 images of these identified planktonic foraminifera. The best-performing model provided the correct species name for an image in the validation set 87.4% of the time and included the correct name in its top three guesses 97.7% of the time. 
Together, these resources provide a rigorous set of training tools in modern planktonic foraminiferal taxonomy and a means of rapidly generating assemblage data via machine learning in future studies for applications such as paleotemperature reconstruction. Article in Journal/Newspaper Planktonic foraminifera Archives ouvertes de Paris-Saclay Paleoceanography and Paleoclimatology 34 7 1157 1177