Inference of structure in subdivided populations at low levels of genetic differentiation—the correlated allele frequencies model revisited

Bibliographic Details
Published in: Bioinformatics
Main Author: Guillot, Gilles
Format: Article in Journal/Newspaper
Language: English
Published: Oxford University Press (OUP) 2008
Subjects:
Online Access: http://dx.doi.org/10.1093/bioinformatics/btn419
https://academic.oup.com/bioinformatics/article-pdf/24/19/2222/49049897/bioinformatics_24_19_2222.pdf
Description
Summary: Motivation: This article considers the problem of estimating population genetic subdivision from multilocus genotype data. The model considered makes use of genotypes and, optionally, of the spatial coordinates of sampled individuals. Particular attention is paid to the case of low genetic differentiation, using a previously described Bayesian clustering model in which allele frequencies are assumed to be a priori correlated. Under this model, various inference problems are considered, in particular the common and difficult, but still unaddressed, situation where the number of populations is unknown. Results: A Markov chain Monte Carlo algorithm and a new post-processing scheme are proposed. They are shown to significantly improve on the accuracy of previously existing algorithms in terms of the estimated number of populations and estimated population membership. This is illustrated numerically with data simulated from the prior-likelihood model used in inference and with data simulated from a Wright–Fisher model. Improvements are also illustrated on a real dataset of eighty-eight wolverines (Gulo gulo) genotyped at 10 microsatellite loci. The solutions presented here are not specific to any clustering model and are hence relevant to many settings in population genetics where weakly differentiated populations are assumed or sought. Availability: The improvements will be made available in version 3.0.0 of the R package Geneland. Information on how to get and use the software is available from http://folk.uio.no/gillesg/Geneland.html. Supplementary information: http://folk.uio.no/gillesg/CFM/SuppMat.pdf Contact: gilles.guillot@bio.uio.no