Fine-Scale Estimation of Location of Birth from Genome-Wide Single-Nucleotide Polymorphism Data

Abstract Systematic nonrandom mating in populations results in genetic stratification and is predominantly caused by geographic separation, providing the opportunity to infer individuals’ birthplace from genetic data. Such inference has been demonstrated for individuals’ country of birth, but here w...

Bibliographic Details
Published in: Genetics
Main Authors: Hoggart, Clive J; O’Reilly, Paul F; Kaakinen, Marika; Zhang, Weihua; Chambers, John C; Kooner, Jaspal S; Coin, Lachlan J M; Jarvelin, Marjo-Riitta
Format: Article in Journal/Newspaper
Language: English
Published: Oxford University Press (OUP) 2012
Subjects:
Online Access: http://dx.doi.org/10.1534/genetics.111.135657
https://academic.oup.com/genetics/article-pdf/190/2/669/49438961/genetics0669.pdf
id croxfordunivpr:10.1534/genetics.111.135657
record_format openpolar
institution Open Polar
collection Oxford University Press (via Crossref)
op_collection_id croxfordunivpr
language English
topic Genetics
topic_facet Genetics
description Abstract: Systematic nonrandom mating in populations results in genetic stratification and is predominantly caused by geographic separation, providing the opportunity to infer individuals’ birthplace from genetic data. Such inference has been demonstrated for individuals’ country of birth, but here we use data from the Northern Finland Birth Cohort 1966 (NFBC1966) to investigate the characteristics of genetic structure within a population and subsequently develop a method for inferring location at a finer scale. Principal component analysis (PCA) shows that while the first PCs are particularly informative for location, there is also location information in the higher-order PCs, but it cannot be captured by a linear model. We introduce a new method, pcLOCATE, which is able to exploit this information to improve the accuracy of location inference. pcLOCATE uses individuals’ PC values to estimate the probability of birth in each town and then averages over all towns to give an estimated longitude and latitude of birth using a fully Bayesian model. We apply pcLOCATE to the NFBC1966 data to estimate parental birthplace, testing with successively more PCs and finding the model with the top 23 PCs most accurate, with a median distance of 23 km between the estimated and the true location. pcLOCATE predicts the most recent residence of NFBC1966 individuals to a median distance of 47 km. We also apply pcLOCATE to Indian individuals from the London Life Sciences Prospective Population Study (LOLIPOP) data, and find that birthplace is predicted to a median distance of 54 km from the true location. A method with such accuracy is potentially valuable in population genetics and forensics.
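The averaging step and the distance-based accuracy measure mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pcLOCATE implementation: it assumes the per-town birth probabilities have already been estimated by some model, and the function names (`estimate_location`, `haversine_km`, `median_error_km`), the town labels, and the coordinates are hypothetical, chosen only for illustration. The use of great-circle (haversine) distance is also an assumption; the abstract does not say how distance was measured.

```python
import math
import statistics


def estimate_location(town_probs, town_coords):
    """Probability-weighted mean of town coordinates.

    town_probs:  dict town -> probability of birth in that town
                 (assumed to come from some upstream model)
    town_coords: dict town -> (longitude, latitude) in degrees
    """
    total = sum(town_probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    lon = sum(p * town_coords[t][0] for t, p in town_probs.items()) / total
    lat = sum(p * town_coords[t][1] for t, p in town_probs.items()) / total
    return lon, lat


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def median_error_km(estimated, true):
    """Median distance between paired estimated and true (lon, lat) points."""
    return statistics.median(
        haversine_km(e[0], e[1], t[0], t[1]) for e, t in zip(estimated, true)
    )


# Hypothetical towns and coordinates, for illustration only.
coords = {"A": (24.0, 65.0), "B": (28.0, 67.0)}
probs = {"A": 0.75, "B": 0.25}
est = estimate_location(probs, coords)
```

A simple weighted mean of longitude and latitude is a reasonable approximation only over a small region; over larger areas the averaging would need to be done on the sphere.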
format Article in Journal/Newspaper
author Hoggart, Clive J
O’Reilly, Paul F
Kaakinen, Marika
Zhang, Weihua
Chambers, John C
Kooner, Jaspal S
Coin, Lachlan J M
Jarvelin, Marjo-Riitta
author_facet Hoggart, Clive J
O’Reilly, Paul F
Kaakinen, Marika
Zhang, Weihua
Chambers, John C
Kooner, Jaspal S
Coin, Lachlan J M
Jarvelin, Marjo-Riitta
author_sort Hoggart, Clive J
title Fine-Scale Estimation of Location of Birth from Genome-Wide Single-Nucleotide Polymorphism Data
title_sort fine-scale estimation of location of birth from genome-wide single-nucleotide polymorphism data
publisher Oxford University Press (OUP)
publishDate 2012
url http://dx.doi.org/10.1534/genetics.111.135657
https://academic.oup.com/genetics/article-pdf/190/2/669/49438961/genetics0669.pdf
geographic Indian
geographic_facet Indian
genre Northern Finland
genre_facet Northern Finland
op_source Genetics
volume 190, issue 2, page 669-677
ISSN 1943-2631
op_rights https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
op_doi https://doi.org/10.1534/genetics.111.135657
container_title Genetics
container_volume 190
container_issue 2
container_start_page 669
op_container_end_page 677
_version_ 1766144592348971008