Accurate Partition of Individuals Into Full-Sib Families From Genetic Data Without Parental Information

Bibliographic Details
Published in: Genetics
Main Authors: Smith, Bruce R; Herbinger, Christophe M; Merry, Heather R
Format: Article in Journal/Newspaper
Language: English
Published: Oxford University Press (OUP) 2001
Online Access: http://dx.doi.org/10.1093/genetics/158.3.1329
https://academic.oup.com/genetics/article-pdf/158/3/1329/42033230/genetics1329.pdf
Description
Summary: Two Markov chain Monte Carlo algorithms are proposed for partitioning individuals into full-sib groups using single-locus genetic marker data when no parental information is available. The algorithms move through the sibship configuration space to locate the configuration that maximizes either an overall score based on pairwise likelihood ratios of being full-sib versus unrelated, or the full joint likelihood of the proposed family structure. Using these methods, up to 757 of 759 Atlantic salmon were correctly classified into 12 full-sib families of unequal size using four microsatellite markers. Large-scale simulations were performed to assess the sensitivity of the procedures to the number of loci, the number of alleles per locus, the allelic distribution type, the distribution of families, and independent knowledge of population allelic frequencies. The number of loci and the number of alleles per locus had the greatest impact on accuracy. Very good accuracy can be obtained with as few as four loci when each has at least eight alleles. With the pairwise likelihood approach, accuracy decreases when allelic frequencies are estimated from small target samples with skewed family distributions; we present an iterative approach that partly corrects this problem. The full likelihood approach is less sensitive to the precision of allelic frequency estimates but did not perform as well with the large data set or when little information was available (e.g., four loci with four alleles).
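
The pairwise-score search described in the summary can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a symmetric matrix of pairwise log likelihood ratios (full-sib versus unrelated) has already been computed, and it uses a simple Metropolis-style acceptance rule with a single-individual relocation move. The names `partition_score`, `search_partition`, `log_lr`, and `temperature` are hypothetical and chosen only for this example.

```python
import math
import random


def partition_score(labels, log_lr):
    """Sum the pairwise log likelihood ratios over all pairs in the same group."""
    n = len(labels)
    return sum(
        log_lr[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if labels[i] == labels[j]
    )


def search_partition(log_lr, n_steps=5000, temperature=1.0, seed=0):
    """Stochastic search for a high-scoring partition (illustrative assumption,
    not the published method). log_lr must be a symmetric n x n matrix."""
    rng = random.Random(seed)
    n = len(log_lr)
    labels = list(range(n))  # start with every individual in its own group
    score = partition_score(labels, log_lr)
    best_labels, best_score = labels[:], score

    for _ in range(n_steps):
        i = rng.randrange(n)
        old = labels[i]
        # Candidate targets: any existing group, or a fresh singleton group.
        fresh = max(labels) + 1
        new = rng.choice(sorted(set(labels) | {fresh}))
        if new == old:
            continue
        # Score change from moving individual i out of `old` and into `new`.
        gain = sum(log_lr[i][j] for j in range(n) if j != i and labels[j] == new)
        loss = sum(log_lr[i][j] for j in range(n) if j != i and labels[j] == old)
        delta = gain - loss
        # Always accept improvements; accept worse moves with probability
        # exp(delta / temperature) so the search can leave local optima.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            labels[i] = new
            score += delta
            if score > best_score:
                best_labels, best_score = labels[:], score

    return best_labels, best_score
```

Because the proposal here is not balanced across partitions with different numbers of groups, the sketch should be read as a stochastic optimizer that tracks the best configuration visited, not as a sampler from a well-defined posterior. Given a toy matrix in which true sibling pairs score positively and other pairs negatively, `search_partition` returns one label per individual, with equal labels marking a proposed full-sib group.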