Maximizing the reliability and the number of species assignments in metabarcoding studies using a curated regional library and a public repository

Biodiversity assessments relying on DNA have increased rapidly over the last decade. However, the reliability of taxonomic assignments in metabarcoding studies is variable and affected by the reference databases and the assignment methods used. Species level assignments are usually considered as reliable using regional libraries but unreliable using public repositories.

Bibliographic Details
Published in: Metabarcoding and Metagenomics
Main Authors: Bourret, Audrey; Nozeres, Claude; Parent, Eric; Parent, Genevieve J.
Format: Article in Journal/Newspaper
Language: English
Published: Pensoft Publishers 2023
Subjects: classifier; cytochrome C oxidase I; GenBank; marine species; metagenomics; reference sequence library
Online Access: https://doi.org/10.3897/mbmg.7.98539
https://mbmg.pensoft.net/article/98539/
id ftpensoft:10.3897/mbmg.7.98539
record_format openpolar
spelling ftpensoft:10.3897/mbmg.7.98539 2023-05-15T17:45:43+02:00 Maximizing the reliability and the number of species assignments in metabarcoding studies using a curated regional library and a public repository Bourret,Audrey Nozeres,Claude Eric,Parent Parent,Genevieve J. 2023 text/html https://doi.org/10.3897/mbmg.7.98539 https://mbmg.pensoft.net/article/98539/ en eng Pensoft Publishers info:eu-repo/semantics/altIdentifier/eissn/2534-9708 info:eu-repo/semantics/openAccess Metabarcoding and Metagenomics 7: e98539 classifier cytochrome C oxidase I GenBank marine species metagenomics reference sequence library Research Article 2023 ftpensoft https://doi.org/10.3897/mbmg.7.98539 2023-02-28T01:00:56Z Biodiversity assessments relying on DNA have increased rapidly over the last decade. However, the reliability of taxonomic assignments in metabarcoding studies is variable and affected by the reference databases and the assignment methods used. Species level assignments are usually considered as reliable using regional libraries but unreliable using public repositories. In this study, we aimed to test this assumption for metazoan species detected in the Gulf of St. Lawrence in the Northwest Atlantic. We first created a regional library (GSL-rl) by data mining COI barcode sequences from BOLD, and included a reliability ranking system for species assignments. We then estimated 1) the accuracy and precision of the public repository NCBI-nt for species assignments using sequences from the regional library and 2) compared the detection and reliability of species assignments of a metabarcoding dataset using either NCBI-nt or the regional library and popular assignment methods. With NCBI-nt and sequences from the regional library, the BLAST-LCA (least common ancestor) method was the most precise method for species assignments, but the accuracy was higher with the BLAST-TopHit method (>80% over all taxa, between 70% and 90% amongst taxonomic groups). 
With the metabarcoding dataset, the reliability of species assignments was greater using GSL-rl compared to NCBI-nt. However, we also observed that the total number of reliable species assignments could be maximized using both GSL-rl and NCBI-nt with different optimized assignment methods. The use of a two-step approach for species assignments, i.e., using a regional library and a public repository, could improve the reliability and the number of detected species in metabarcoding studies. Article in Journal/Newspaper Northwest Atlantic Pensoft Publishers Metabarcoding and Metagenomics 7
institution Open Polar
collection Pensoft Publishers
op_collection_id ftpensoft
language English
topic classifier
cytochrome C oxidase I
GenBank
marine species
metagenomics
reference sequence library
spellingShingle classifier
cytochrome C oxidase I
GenBank
marine species
metagenomics
reference sequence library
Bourret,Audrey
Nozeres,Claude
Parent,Eric
Parent,Genevieve J.
Maximizing the reliability and the number of species assignments in metabarcoding studies using a curated regional library and a public repository
topic_facet classifier
cytochrome C oxidase I
GenBank
marine species
metagenomics
reference sequence library
description Biodiversity assessments relying on DNA have increased rapidly over the last decade. However, the reliability of taxonomic assignments in metabarcoding studies is variable and affected by the reference databases and the assignment methods used. Species level assignments are usually considered as reliable using regional libraries but unreliable using public repositories. In this study, we aimed to test this assumption for metazoan species detected in the Gulf of St. Lawrence in the Northwest Atlantic. We first created a regional library (GSL-rl) by data mining COI barcode sequences from BOLD, and included a reliability ranking system for species assignments. We then estimated 1) the accuracy and precision of the public repository NCBI-nt for species assignments using sequences from the regional library and 2) compared the detection and reliability of species assignments of a metabarcoding dataset using either NCBI-nt or the regional library and popular assignment methods. With NCBI-nt and sequences from the regional library, the BLAST-LCA (least common ancestor) method was the most precise method for species assignments, but the accuracy was higher with the BLAST-TopHit method (>80% over all taxa, between 70% and 90% amongst taxonomic groups). With the metabarcoding dataset, the reliability of species assignments was greater using GSL-rl compared to NCBI-nt. However, we also observed that the total number of reliable species assignments could be maximized using both GSL-rl and NCBI-nt with different optimized assignment methods. The use of a two-step approach for species assignments, i.e., using a regional library and a public repository, could improve the reliability and the number of detected species in metabarcoding studies.
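The assignment logic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Hit` record, the 97% identity threshold, the rank list, and the choice of LCA for the regional library and TopHit for the public repository are all assumptions made for illustration. The abstract only states that different optimized methods were used with GSL-rl and NCBI-nt.

```python
from dataclasses import dataclass

# Ranks from broadest to finest; a lineage is a tuple of names in this order.
# (Assumed rank set, chosen for illustration.)
RANKS = ("phylum", "class", "order", "family", "genus", "species")


@dataclass(frozen=True)
class Hit:
    """One BLAST hit for a query sequence (hypothetical, simplified record)."""
    lineage: tuple      # taxon names ordered as in RANKS
    identity: float     # percent identity to the query


def top_hit(hits, min_identity=97.0):
    """TopHit: lineage of the best-scoring retained hit, or None."""
    kept = [h for h in hits if h.identity >= min_identity]
    if not kept:
        return None
    return max(kept, key=lambda h: h.identity).lineage


def lca(hits, min_identity=97.0):
    """LCA: deepest lineage prefix shared by every retained hit, or None."""
    kept = [h.lineage for h in hits if h.identity >= min_identity]
    if not kept:
        return None
    shared = []
    for names in zip(*kept):          # walk ranks from phylum to species
        if len(set(names)) != 1:      # hits disagree at this rank: stop
            break
        shared.append(names[0])
    return tuple(shared) or None


def is_species_level(lineage):
    """True when the assignment reaches the finest rank in RANKS."""
    return lineage is not None and len(lineage) == len(RANKS)


def two_step(regional_hits, public_hits, min_identity=97.0):
    """Try the regional library first, then fall back to the public repository.

    Assumption: LCA on the regional library, TopHit on the public repository.
    """
    first = lca(regional_hits, min_identity)
    if is_species_level(first):
        return first, "regional"
    second = top_hit(public_hits, min_identity)
    if is_species_level(second):
        return second, "public"
    return None, "unassigned"
```

With two near-identical hits from congeneric species, `lca` stops at the genus while `top_hit` still returns a species. This shows in miniature why the two methods can differ in how many species-level assignments they produce and how reliable those assignments are.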
format Article in Journal/Newspaper
author Bourret,Audrey
Nozeres,Claude
Parent,Eric
Parent,Genevieve J.
author_facet Bourret,Audrey
Nozeres,Claude
Parent,Eric
Parent,Genevieve J.
author_sort Bourret,Audrey
title Maximizing the reliability and the number of species assignments in metabarcoding studies using a curated regional library and a public repository
title_sort maximizing the reliability and the number of species assignments in metabarcoding studies using a curated regional library and a public repository
publisher Pensoft Publishers
publishDate 2023
url https://doi.org/10.3897/mbmg.7.98539
https://mbmg.pensoft.net/article/98539/
genre Northwest Atlantic
genre_facet Northwest Atlantic
op_source Metabarcoding and Metagenomics 7: e98539
op_relation info:eu-repo/semantics/altIdentifier/eissn/2534-9708
op_rights info:eu-repo/semantics/openAccess
op_doi https://doi.org/10.3897/mbmg.7.98539
container_title Metabarcoding and Metagenomics
container_volume 7
_version_ 1766148933256478720