How many specimens make a sufficient training set for automated three-dimensional feature extraction?

Deep learning has emerged as a robust tool for automating feature extraction from three-dimensional images, offering an efficient alternative to labour-intensive and potentially biased manual image segmentation methods. However, there has been limited exploration into the optimal training set sizes,...

Bibliographic Details
Published in:	Royal Society Open Science
Main Authors:	Mulqueeney, James M., Searle-Barnes, Alex, Brombacher, Anieke, Sweeney, Marisa, Goswami, Anjali, Ezard, Thomas H. G.
Other Authors:	Natural Environment Research Council, Leverhulme Trust, European Research Council
Format:	Article in Journal/Newspaper
Language:	English
Published:	The Royal Society 2024
Subjects:	Planktonic foraminifera
Online Access:	http://dx.doi.org/10.1098/rsos.240113 https://royalsocietypublishing.org/doi/pdf/10.1098/rsos.240113 https://royalsocietypublishing.org/doi/full-xml/10.1098/rsos.240113

id	crroyalsociety:10.1098/rsos.240113
record_format	openpolar
spelling	crroyalsociety:10.1098/rsos.240113 2024-09-15T18:31:03+00:00 How many specimens make a sufficient training set for automated three-dimensional feature extraction? Mulqueeney, James M. Searle-Barnes, Alex Brombacher, Anieke Sweeney, Marisa Goswami, Anjali Ezard, Thomas H. G. Natural Environment Research Council Leverhulme Trust European Research Council 2024 http://dx.doi.org/10.1098/rsos.240113 https://royalsocietypublishing.org/doi/pdf/10.1098/rsos.240113 https://royalsocietypublishing.org/doi/full-xml/10.1098/rsos.240113 en eng The Royal Society http://creativecommons.org/licenses/by/4.0/ http://creativecommons.org/licenses/by/4.0/ Royal Society Open Science volume 11, issue 6 ISSN 2054-5703 journal-article 2024 crroyalsociety https://doi.org/10.1098/rsos.240113 2024-08-19T04:24:54Z Deep learning has emerged as a robust tool for automating feature extraction from three-dimensional images, offering an efficient alternative to labour-intensive and potentially biased manual image segmentation methods. However, there has been limited exploration into the optimal training set sizes, including assessing whether artificial expansion by data augmentation can achieve consistent results in less time and how consistent these benefits are across different types of traits. In this study, we manually segmented 50 planktonic foraminifera specimens from the genus Menardella to determine the minimum number of training images required to produce accurate volumetric and shape data from internal and external structures. The results reveal unsurprisingly that deep learning models improve with a larger number of training images with eight specimens being required to achieve 95% accuracy. Furthermore, data augmentation can enhance network accuracy by up to 8.0%. Notably, predicting both volumetric and shape measurements for the internal structure poses a greater challenge compared with the external structure, owing to low contrast differences between different materials and increased geometric complexity. These results provide novel insight into optimal training set sizes for precise image segmentation of diverse traits and highlight the potential of data augmentation for enhancing multivariate feature extraction from three-dimensional images. Article in Journal/Newspaper Planktonic foraminifera The Royal Society Royal Society Open Science 11 6
institution	Open Polar
collection	The Royal Society
op_collection_id	crroyalsociety
language	English
description	Deep learning has emerged as a robust tool for automating feature extraction from three-dimensional images, offering an efficient alternative to labour-intensive and potentially biased manual image segmentation methods. However, there has been limited exploration into the optimal training set sizes, including assessing whether artificial expansion by data augmentation can achieve consistent results in less time and how consistent these benefits are across different types of traits. In this study, we manually segmented 50 planktonic foraminifera specimens from the genus Menardella to determine the minimum number of training images required to produce accurate volumetric and shape data from internal and external structures. The results reveal unsurprisingly that deep learning models improve with a larger number of training images with eight specimens being required to achieve 95% accuracy. Furthermore, data augmentation can enhance network accuracy by up to 8.0%. Notably, predicting both volumetric and shape measurements for the internal structure poses a greater challenge compared with the external structure, owing to low contrast differences between different materials and increased geometric complexity. These results provide novel insight into optimal training set sizes for precise image segmentation of diverse traits and highlight the potential of data augmentation for enhancing multivariate feature extraction from three-dimensional images.
author2	Natural Environment Research Council Leverhulme Trust European Research Council
format	Article in Journal/Newspaper
author	Mulqueeney, James M. Searle-Barnes, Alex Brombacher, Anieke Sweeney, Marisa Goswami, Anjali Ezard, Thomas H. G.
spellingShingle	Mulqueeney, James M. Searle-Barnes, Alex Brombacher, Anieke Sweeney, Marisa Goswami, Anjali Ezard, Thomas H. G. How many specimens make a sufficient training set for automated three-dimensional feature extraction?
author_facet	Mulqueeney, James M. Searle-Barnes, Alex Brombacher, Anieke Sweeney, Marisa Goswami, Anjali Ezard, Thomas H. G.
author_sort	Mulqueeney, James M.
title	How many specimens make a sufficient training set for automated three-dimensional feature extraction?
title_short	How many specimens make a sufficient training set for automated three-dimensional feature extraction?
title_full	How many specimens make a sufficient training set for automated three-dimensional feature extraction?
title_fullStr	How many specimens make a sufficient training set for automated three-dimensional feature extraction?
title_full_unstemmed	How many specimens make a sufficient training set for automated three-dimensional feature extraction?
title_sort	how many specimens make a sufficient training set for automated three-dimensional feature extraction?
publisher	The Royal Society
publishDate	2024
url	http://dx.doi.org/10.1098/rsos.240113 https://royalsocietypublishing.org/doi/pdf/10.1098/rsos.240113 https://royalsocietypublishing.org/doi/full-xml/10.1098/rsos.240113
genre	Planktonic foraminifera
genre_facet	Planktonic foraminifera
op_source	Royal Society Open Science volume 11, issue 6 ISSN 2054-5703
op_rights	http://creativecommons.org/licenses/by/4.0/ http://creativecommons.org/licenses/by/4.0/
op_doi	https://doi.org/10.1098/rsos.240113
container_title	Royal Society Open Science
container_volume	11
container_issue	6
_version_	1810472648249966592

Similar Items