Artificial intelligence applied to the classification of eight middle Eocene species of the genus Podocyrtis (polycystine radiolaria)

Bibliographic Details
Published in: Journal of Micropalaeontology
Main Authors: Carlsson, Veronica, Danelian, Taniel, Boulet, Pierre, Devienne, Philippe, Laforge, Aurelien, Renaudie, Johan
Other Authors: Évolution, Écologie et Paléontologie (Evo-Eco-Paleo) - UMR 8198 (Evo-Eco-Paléo (EEP)), Université de Lille-Centre National de la Recherche Scientifique (CNRS), Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 (CRIStAL), Centrale Lille-Université de Lille-Centre National de la Recherche Scientifique (CNRS), Museum für Naturkunde Berlin, European Project: 847568,H2020,H2020-MSCA-COFUND-2018,PEARL(2019)
Format: Article in Journal/Newspaper
Language: English
Published: HAL CCSD 2022
Subjects:
Online Access: https://hal.science/hal-03962996
https://hal.science/hal-03962996/document
https://hal.science/hal-03962996/file/jm-41-165-2022.pdf
https://doi.org/10.5194/jm-41-165-2022
Description
Summary: This study evaluates the application of artificial intelligence (AI) to the automatic classification of radiolarians, using as an example eight distinct morphospecies of the Eocene radiolarian genus Podocyrtis, which belong to three different evolutionary lineages and are useful in biostratigraphy. The samples used in this study were recovered from the equatorial Atlantic (ODP Leg 207) and were supplemented with samples from the North Atlantic and Indian oceans. To create an automatic classification tool, numerous images of the investigated species were needed to train a MobileNet convolutional neural network coded entirely in Python. Three different datasets were obtained. The first consists of a mixture of broken and complete specimens, some of which appear blurry. The second and third datasets were reduced in two further steps that exclude broken and blurry specimens, thereby increasing quality. A random 85 % of all specimens was selected for training, while the remaining 15 % was used for validation. The MobileNet architecture had an overall accuracy of about 91 % for all datasets. Three predictive models were then created, one trained on each dataset, and they worked well for the classification of Podocyrtis from the Indian Ocean (Madingley Rise, ODP Leg 115, Hole 711A) and the western North Atlantic Ocean (New Jersey slope, DSDP Leg 95, Hole 612 and Blake Nose, ODP Leg 171B, Hole 1051A). These samples also provided clearer images since they were mounted with Canada balsam rather than Norland epoxy. Despite some morphological differences encountered in different parts of the world's oceans and differences in image quality, most species could be correctly classified or at least classified as a neighboring species along a lineage.
Classification improved slightly for some species after cropping images and/or removing background particles from images that did not segment properly in the image ...
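
The random 85 % / 15 % division of specimens into training and validation sets mentioned in the summary can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `split_specimens`, the file names, and the fixed seed are hypothetical, and the summary does not state whether the split was made per species or over the pooled images, so the sketch simply shuffles one pooled list.

```python
import random


def split_specimens(image_paths, train_fraction=0.85, seed=0):
    """Randomly split specimen images into training and validation lists.

    Hypothetical helper for illustration only; the 0.85 default mirrors the
    85 % / 15 % proportions given in the summary.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(image_paths)           # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for specimen images.
images = [f"specimen_{i:03d}.png" for i in range(200)]
train, val = split_specimens(images)
print(len(train), len(val))  # 170 30
```

With 200 placeholder images the sketch yields 170 training and 30 validation images. A per-species (stratified) split would keep the proportions equal within each morphospecies, which may matter when some species are represented by few specimens.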