An accurate assignment test for extremely low-coverage whole-genome sequence data


Bibliographic Details
Main Authors: Ferrari, Giada; Atmore, Lane M; Jentoft, Sissel; Jakobsen, Kjetill S; Makowiecki, Daniel; Barrett, James H; Star, Bastiaan
Format: Article in Journal/Newspaper
Language: English
Published: Wiley 2022
Online Access: https://dx.doi.org/10.17863/cam.78927
https://www.repository.cam.ac.uk/handle/1810/331473
Description
Summary: Genomic assignment tests can provide important diagnostic biological characteristics, such as population of origin or ecotype. Yet, assignment tests often rely on moderate- to high-coverage sequence data that can be difficult to obtain for fields such as molecular ecology and ancient DNA. We have developed a novel approach that efficiently assigns biologically relevant information (i.e., population identity or structural variants such as inversions) in extremely low-coverage sequence data. First, we generate databases from existing reference data using a subset of diagnostic single nucleotide polymorphisms (SNPs) associated with a biological characteristic. Low-coverage alignment files are subsequently compared to these databases to ascertain allelic state, yielding a joint probability for each association. To assess the efficacy of this approach, we assigned haplotypes and population identity in Heliconius butterflies, Atlantic herring, and Atlantic cod using chromosomal inversion sites and whole-genome ...
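The two steps named in the summary (a database of diagnostic SNPs, then a joint probability from the alleles seen in low-coverage reads) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `assign`, the database layout, the uniform sequencing-error model, and the equal priors across groups are all assumptions made here for illustration, and the allele frequencies are invented toy values.

```python
import math

def assign(reads, database, error_rate=0.01):
    """Toy joint-probability assignment (hypothetical, not the published method).

    reads:    list of (snp_id, observed_base), e.g. one read per covered site
    database: {snp_id: {group: {base: allele_frequency}}}
              assumed to list every group at every SNP
    Returns {group: posterior probability}, assuming equal priors.
    """
    log_lik = {}
    for snp_id, base in reads:
        if snp_id not in database:
            continue  # read does not cover a diagnostic site
        for group, freqs in database[snp_id].items():
            p = freqs.get(base, 0.0)
            # Assumed error model: the true base is read correctly with
            # probability 1 - error_rate, otherwise as one of the 3 others.
            p_obs = (1 - error_rate) * p + error_rate * (1 - p) / 3
            log_lik[group] = log_lik.get(group, 0.0) + math.log(p_obs)
    if not log_lik:
        return {}
    # Normalise in log space to avoid underflow when many SNPs are combined.
    top = max(log_lik.values())
    weights = {g: math.exp(v - top) for g, v in log_lik.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

# Invented reference frequencies for two groups at two diagnostic SNPs.
database = {
    "snp1": {"A": {"C": 0.95, "T": 0.05}, "B": {"C": 0.05, "T": 0.95}},
    "snp2": {"A": {"G": 0.90, "A": 0.10}, "B": {"G": 0.10, "A": 0.90}},
}
reads = [("snp1", "C"), ("snp2", "G")]
posterior = assign(reads, database)
print(posterior)
```

Summing log-likelihoods treats the SNPs as independent, which is a simplification; linked sites, such as those inside an inversion, would violate it.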