Maximizing the reliability and the number of species assignments in metabarcoding studies using a curated regional library and a public repository

Biodiversity assessments relying on DNA have increased rapidly over the last decade. However, the reliability of taxonomic assignments in metabarcoding studies is variable and affected by the reference databases and the assignment methods used. Species level assignments are usually considered as rel...

Bibliographic Details
Main Authors: Bourret, Audrey, Nozères, Claude, Parent, Eric, Parent, Geneviève J.
Format: Article in Journal/Newspaper
Language: unknown
Published: Pensoft Publishers 2023
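Subjects: classifier; cytochrome C oxidase I; GenBank; marine species; metagenomics; reference sequence library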
Online Access: https://doi.org/10.3897/mbmg.7.98539
id ftzenodo:oai:zenodo.org:7676678
record_format openpolar
institution Open Polar
collection Zenodo
op_collection_id ftzenodo
language unknown
topic classifier
cytochrome C oxidase I
GenBank
marine species
metagenomics
reference sequence library
description Biodiversity assessments relying on DNA have increased rapidly over the last decade. However, the reliability of taxonomic assignments in metabarcoding studies is variable and affected by the reference databases and the assignment methods used. Species-level assignments are usually considered reliable when using regional libraries but unreliable when using public repositories. In this study, we aimed to test this assumption for metazoan species detected in the Gulf of St. Lawrence in the Northwest Atlantic. We first created a regional library (GSL-rl) by data mining COI barcode sequences from BOLD, and included a reliability ranking system for species assignments. We then 1) estimated the accuracy and precision of the public repository NCBI-nt for species assignments using sequences from the regional library and 2) compared the detection and reliability of species assignments of a metabarcoding dataset using either NCBI-nt or the regional library and popular assignment methods. With NCBI-nt and sequences from the regional library, the BLAST-LCA (least common ancestor) method was the most precise method for species assignments, but the accuracy was higher with the BLAST-TopHit method (>80% over all taxa, between 70% and 90% amongst taxonomic groups). With the metabarcoding dataset, the reliability of species assignments was greater using GSL-rl than using NCBI-nt. However, we also observed that the total number of reliable species assignments could be maximized by using both GSL-rl and NCBI-nt with different optimized assignment methods. The use of a two-step approach for species assignments, i.e., using a regional library and a public repository, could improve the reliability and the number of detected species in metabarcoding studies.
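The difference between the two BLAST-based assignment methods named in the description, and the idea of a two-step approach, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' pipeline: the hit layout (a lineage tuple plus a percent identity), the 97% identity threshold, the function names, and the choice to pair LCA with the regional library and TopHit with the public repository. Real BLAST-LCA implementations typically apply additional filters that are omitted here.

```python
from typing import Callable, Optional

# Hypothetical hit layout: (lineage from family to species, percent identity).
Lineage = tuple[str, ...]
Hit = tuple[Lineage, float]


def top_hit(hits: list[Hit], min_identity: float = 97.0) -> Optional[Lineage]:
    """Return the lineage of the single best hit if it passes the threshold."""
    if not hits:
        return None
    lineage, identity = max(hits, key=lambda h: h[1])
    return lineage if identity >= min_identity else None


def lca(hits: list[Hit], min_identity: float = 97.0) -> Optional[Lineage]:
    """Return the lineage prefix shared by all hits that pass the threshold."""
    kept = [lineage for lineage, identity in hits if identity >= min_identity]
    if not kept:
        return None
    shared = []
    for ranks in zip(*kept):
        if len(set(ranks)) != 1:  # hits disagree at this rank, stop here
            break
        shared.append(ranks[0])
    return tuple(shared)


def two_step(
    regional_hits: list[Hit],
    public_hits: list[Hit],
    regional_method: Callable[[list[Hit]], Optional[Lineage]] = lca,
    public_method: Callable[[list[Hit]], Optional[Lineage]] = top_hit,
    species_depth: int = 3,
) -> tuple[Optional[Lineage], Optional[str]]:
    """Try the regional library first, then fall back to the public repository.

    Only assignments that reach species level (lineage of full depth) are kept.
    """
    steps = (
        ("regional", regional_hits, regional_method),
        ("public", public_hits, public_method),
    )
    for source, hits, method in steps:
        lineage = method(hits)
        if lineage is not None and len(lineage) == species_depth:
            return lineage, source
    return None, None


# Illustrative hits for one query sequence (made-up identities).
cod = ("Gadidae", "Gadus", "Gadus morhua")
hits = [
    (cod, 99.5),
    (("Gadidae", "Gadus", "Gadus macrocephalus"), 98.9),
]

print(top_hit(hits))         # best single hit, species level
print(lca(hits))             # shared prefix only, genus level
print(two_step([(cod, 99.5)], hits))
```

In this toy case TopHit commits to a species while LCA stops at the genus because two congeners pass the threshold, which mirrors the trade-off the description reports between the number of species assignments and their precision.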
format Article in Journal/Newspaper
author Bourret, Audrey
Nozères, Claude
Parent, Eric
Parent, Geneviève J.
title Maximizing the reliability and the number of species assignments in metabarcoding studies using a curated regional library and a public repository
publisher Pensoft Publishers
publishDate 2023
url https://doi.org/10.3897/mbmg.7.98539
genre Northwest Atlantic
op_source Metabarcoding and Metagenomics, 7, e98539, (2023-02-23)
op_relation https://doi.org/10.3897/mbmg.7.98539.suppl1
https://zenodo.org/communities/biosyslit
https://doi.org/10.3897/mbmg.7.98539
oai:zenodo.org:7676678
op_rights info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
op_doi https://doi.org/10.3897/mbmg.7.98539
https://doi.org/10.3897/mbmg.7.98539.suppl1
_version_ 1810466872557043712