Efficient clustering of identity-by-descent between multiple individuals

Abstract Motivation: Most existing identity-by-descent (IBD) detection methods only consider haplotype pairs; less attention has been paid to considering multiple haplotypes simultaneously, even though IBD is an equivalence relation on haplotypes that partitions a set of haplotypes into IBD clusters...
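
The abstract's remark that IBD is an equivalence relation partitioning haplotypes into clusters can be illustrated with a minimal sketch. It takes the naive transitive closure of pairwise IBD calls within a single genomic window using union-find. This is not the authors' efficient multiple-IBD method; the function name `ibd_clusters`, the haplotype labels, and the input format are all hypothetical.

```python
from collections import defaultdict


def ibd_clusters(haplotypes, pairwise_ibd):
    """Partition haplotypes by the transitive closure of pairwise IBD.

    haplotypes   : iterable of hashable haplotype labels (hypothetical format)
    pairwise_ibd : iterable of (a, b) pairs called IBD in one window
    Illustration only; not the published clustering method.
    """
    haplotypes = list(haplotypes)
    parent = {h: h for h in haplotypes}

    def find(h):
        # Follow parent links to the root, halving the path on the way.
        while parent[h] != h:
            parent[h] = parent[parent[h]]
            h = parent[h]
        return h

    for a, b in pairwise_ibd:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two equivalence classes

    groups = defaultdict(set)
    for h in haplotypes:
        groups[find(h)].add(h)
    # Sort members and clusters so the result is deterministic.
    return sorted(sorted(g) for g in groups.values())


clusters = ibd_clusters(
    ["h1", "h2", "h3", "h4", "h5"],
    [("h1", "h2"), ("h2", "h3"), ("h4", "h5")],
)
```

With real pairwise calls, a single false-positive pair would merge two unrelated clusters under this naive closure, which is one reason a clustering method may require stronger evidence before adding a haplotype to a cluster.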


Bibliographic Details
Published in:	Bioinformatics
Main Authors:	Qian, Yu; Browning, Brian L.; Browning, Sharon R.
Format:	Article in Journal/Newspaper
Language:	English
Published:	Oxford University Press (OUP) 2013
Subjects:	Computational Mathematics; Computational Theory and Mathematics; Computer Science Applications; Molecular Biology; Biochemistry; Statistics and Probability; Northern Finland
Online Access:	http://dx.doi.org/10.1093/bioinformatics/btt734 https://academic.oup.com/bioinformatics/article-pdf/30/7/915/48921782/bioinformatics_30_7_915.pdf

id	croxfordunivpr:10.1093/bioinformatics/btt734
record_format	openpolar
institution	Open Polar
collection	Oxford University Press
op_collection_id	croxfordunivpr
language	English
topic	Computational Mathematics; Computational Theory and Mathematics; Computer Science Applications; Molecular Biology; Biochemistry; Statistics and Probability
description	Abstract Motivation: Most existing identity-by-descent (IBD) detection methods only consider haplotype pairs; less attention has been paid to considering multiple haplotypes simultaneously, even though IBD is an equivalence relation on haplotypes that partitions a set of haplotypes into IBD clusters. Multiple-haplotype IBD clusters may have advantages over pairwise IBD in some applications, such as IBD mapping. Existing methods for detecting multiple-haplotype IBD clusters are often computationally expensive and unable to handle large samples with thousands of haplotypes. Results: We present a clustering method, efficient multiple-IBD, which uses pairwise IBD segments to infer multiple-haplotype IBD clusters. It expands clusters from seed haplotypes by adding qualified neighbors and extends clusters across sliding windows in the genome. Our method is an order of magnitude faster than existing methods and has comparable performance with respect to the quality of clusters it uncovers. We further investigate the potential application of multiple-haplotype IBD clusters in association studies by testing for association between multiple-haplotype IBD clusters and low-density lipoprotein cholesterol in the Northern Finland Birth Cohort. Using our multiple-haplotype IBD cluster approach, we found an association with a genomic interval covering the PCSK9 gene in these data that is missed by standard single-marker association tests. Previously published studies confirm association of PCSK9 with low-density lipoprotein. Availability and implementation: Source code is available under the GNU Public License http://cs.au.dk/~qianyuxx/EMI/. Contact: qianyuxx@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
format	Article in Journal/Newspaper
author	Qian, Yu; Browning, Brian L.; Browning, Sharon R.
author_sort	Qian, Yu
title	Efficient clustering of identity-by-descent between multiple individuals
publisher	Oxford University Press (OUP)
publishDate	2013
url	http://dx.doi.org/10.1093/bioinformatics/btt734 https://academic.oup.com/bioinformatics/article-pdf/30/7/915/48921782/bioinformatics_30_7_915.pdf
genre	Northern Finland
op_source	Bioinformatics volume 30, issue 7, page 915-922 ISSN 1367-4811 1367-4803
op_doi	https://doi.org/10.1093/bioinformatics/btt734
container_title	Bioinformatics
container_volume	30
container_issue	7
container_start_page	915
op_container_end_page	922


Similar Items