AD-LIBS: inferring ancestry across hybrid genomes using low-coverage sequence data.

Background: Inferring the ancestry of each region of admixed individuals' genomes is useful in studies ranging from disease gene mapping to speciation genetics. Current methods require high-coverage genotype data and phased reference panels, and are therefore inappropriate for many data sets. We...

Bibliographic Details
Main Authors: Schaefer, Nathan K; Shapiro, Beth; Green, Richard E
Format: Article in Journal/Newspaper
Language: unknown
Published: eScholarship, University of California 2017
Subjects:
Online Access: https://escholarship.org/uc/item/04j576r0
id ftcdlib:oai:escholarship.org/ark:/13030/qt04j576r0
record_format openpolar
spelling ftcdlib:oai:escholarship.org/ark:/13030/qt04j576r0 2023-05-15T18:01:45+02:00 AD-LIBS: inferring ancestry across hybrid genomes using low-coverage sequence data. Schaefer, Nathan K Shapiro, Beth Green, Richard E 203 2017-04-04 application/pdf https://escholarship.org/uc/item/04j576r0 unknown eScholarship, University of California qt04j576r0 https://escholarship.org/uc/item/04j576r0 public BMC bioinformatics, vol 18, iss 1 Animals Ursidae Humans Markov Chains Hybridization Genetic Polymorphism Single Nucleotide Genome Software Genetic Speciation Admixture Ancestry Bears Speciation Genetics Human Genome Biological Sciences Information and Computing Sciences Mathematical Sciences Bioinformatics article 2017 ftcdlib 2021-01-01T18:58:53Z Background: Inferring the ancestry of each region of admixed individuals' genomes is useful in studies ranging from disease gene mapping to speciation genetics. Current methods require high-coverage genotype data and phased reference panels, and are therefore inappropriate for many data sets. We present a software application, AD-LIBS, that uses a hidden Markov model to infer ancestry across hybrid genomes without requiring variant calling or phasing. This approach is useful for non-model organisms and in cases of low-coverage data, such as ancient DNA. Results: We demonstrate the utility of AD-LIBS with synthetic data. We then use AD-LIBS to infer ancestry in two published data sets: European human genomes with Neanderthal ancestry and brown bear genomes with polar bear ancestry. AD-LIBS correctly infers 87-91% of ancestry in simulations and produces ancestry maps that agree with published results and global ancestry estimates in humans. In brown bears, we find more polar bear ancestry than has been published previously, using both AD-LIBS and an existing software application for local ancestry inference, HAPMIX.
We validate AD-LIBS polar bear ancestry maps by recovering a geographic signal within bears that mirrors what is seen in SNP data. Finally, we demonstrate that AD-LIBS is more effective than HAPMIX at inferring ancestry when preexisting phased reference data are unavailable and genomes are sequenced to low coverage. Conclusions: AD-LIBS is an effective tool for ancestry inference that can be used even when few individuals are available for comparison or when genomes are sequenced to low coverage. AD-LIBS is therefore likely to be useful in studies of non-model or ancient organisms that lack large amounts of genomic DNA. AD-LIBS can therefore expand the range of studies in which admixture mapping is a viable tool. Article in Journal/Newspaper polar bear University of California: eScholarship
institution Open Polar
collection University of California: eScholarship
op_collection_id ftcdlib
language unknown
topic Animals
Ursidae
Humans
Markov Chains
Hybridization, Genetic
Polymorphism, Single Nucleotide
Genome
Software
Genetic Speciation
Admixture
Ancestry
Bears
Speciation
Genetics
Human Genome
Biological Sciences
Information and Computing Sciences
Mathematical Sciences
Bioinformatics
description Background: Inferring the ancestry of each region of admixed individuals' genomes is useful in studies ranging from disease gene mapping to speciation genetics. Current methods require high-coverage genotype data and phased reference panels, and are therefore inappropriate for many data sets. We present a software application, AD-LIBS, that uses a hidden Markov model to infer ancestry across hybrid genomes without requiring variant calling or phasing. This approach is useful for non-model organisms and in cases of low-coverage data, such as ancient DNA. Results: We demonstrate the utility of AD-LIBS with synthetic data. We then use AD-LIBS to infer ancestry in two published data sets: European human genomes with Neanderthal ancestry and brown bear genomes with polar bear ancestry. AD-LIBS correctly infers 87-91% of ancestry in simulations and produces ancestry maps that agree with published results and global ancestry estimates in humans. In brown bears, we find more polar bear ancestry than has been published previously, using both AD-LIBS and an existing software application for local ancestry inference, HAPMIX. We validate AD-LIBS polar bear ancestry maps by recovering a geographic signal within bears that mirrors what is seen in SNP data. Finally, we demonstrate that AD-LIBS is more effective than HAPMIX at inferring ancestry when preexisting phased reference data are unavailable and genomes are sequenced to low coverage. Conclusions: AD-LIBS is an effective tool for ancestry inference that can be used even when few individuals are available for comparison or when genomes are sequenced to low coverage. AD-LIBS is therefore likely to be useful in studies of non-model or ancient organisms that lack large amounts of genomic DNA. AD-LIBS can therefore expand the range of studies in which admixture mapping is a viable tool.
format Article in Journal/Newspaper
author Schaefer, Nathan K
Shapiro, Beth
Green, Richard E
author_sort Schaefer, Nathan K
title AD-LIBS: inferring ancestry across hybrid genomes using low-coverage sequence data.
publisher eScholarship, University of California
publishDate 2017
url https://escholarship.org/uc/item/04j576r0
op_coverage 203
genre polar bear
op_source BMC bioinformatics, vol 18, iss 1
op_relation qt04j576r0
https://escholarship.org/uc/item/04j576r0
op_rights public
_version_ 1766171274714808320