Novel probabilistic models of spatial genetic ancestry with applications to stratification correction in genome-wide association studies

Genetic variation in human populations is influenced by geographic ancestry due to spatial locality in historical mating and migration patterns. Spatial population structure in genetic datasets has been traditionally analyzed using either model-free algorithms, such as principal components analysis...


Bibliographic Details
Main Authors: Bhaskar, Anand, Javanmard, Adel, Courtade, Thomas A., Tse, David
Format: Report
Language: unknown
Published: arXiv 2016
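Subjects: Populations and Evolution q-bio.PE; Methodology stat.ME; FOS Biological sciences; FOS Computer and information sciences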
Online Access: https://dx.doi.org/10.48550/arxiv.1610.07306
https://arxiv.org/abs/1610.07306
id ftdatacite:10.48550/arxiv.1610.07306
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic Populations and Evolution q-bio.PE
Methodology stat.ME
FOS Biological sciences
FOS Computer and information sciences
description Genetic variation in human populations is influenced by geographic ancestry due to spatial locality in historical mating and migration patterns. Spatial population structure in genetic datasets has been traditionally analyzed using either model-free algorithms, such as principal components analysis (PCA) and multidimensional scaling, or using explicit spatial probabilistic models of allele frequency evolution. We develop a general probabilistic model and an associated inference algorithm that unify the model-based and data-driven approaches to visualizing and inferring population structure. Our algorithm, Geographic Ancestry Positioning (GAP), relates local genetic distances between samples to their spatial distances, and can be used for visually discerning population structure as well as accurately inferring the spatial origin of individuals on a two-dimensional continuum. On both simulated and several real datasets from diverse human populations, GAP exhibits substantially lower error in reconstructing spatial ancestry coordinates compared to PCA. Our spatial inference algorithm can also be effectively applied to the problem of population stratification in genome-wide association studies (GWAS), where hidden population structure can create fictitious associations when population ancestry is correlated with both the genotype and the trait. We develop an association test that uses the ancestry coordinates inferred by GAP to accurately account for ancestry-induced correlations in GWAS. Based on simulations and analysis of a dataset of 10 metabolic traits measured in a Northern Finland cohort, which is known to exhibit significant population structure, we find that our method has superior power to current approaches. Supplementary information is included in the main text.
format Report
author Bhaskar, Anand
Javanmard, Adel
Courtade, Thomas A.
Tse, David
title Novel probabilistic models of spatial genetic ancestry with applications to stratification correction in genome-wide association studies
publisher arXiv
publishDate 2016
url https://dx.doi.org/10.48550/arxiv.1610.07306
https://arxiv.org/abs/1610.07306
genre Northern Finland
op_rights arXiv.org perpetual, non-exclusive license
http://arxiv.org/licenses/nonexclusive-distrib/1.0/
op_doi https://doi.org/10.48550/arxiv.1610.07306
_version_ 1766144658244632576