Characterising copy number polymorphisms using next generation sequencing data

Bibliographic Details
Main Author: Li, Zhiwei
Format: Bachelor Thesis
Language: English
Published: Uppsala universitet, Institutionen för biologisk grundutbildning 2019
Subjects:
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-386050
Description
Summary: We developed a pipeline to identify copy number polymorphisms (CNPs) in the Northern Swedish population using whole genome sequencing (WGS) data. Two different methodologies were applied to discover CNPs in more than 1,000 individuals. We also studied the association between the identified CNPs and the expression levels of 438 plasma proteins measured in the same population. The identified CNPs were summarized and filtered into a population copy number matrix covering 1,021 individuals and 243,987 non-overlapping CNP loci. For the 872 individuals with both WGS and plasma protein biomarker data, we conducted linear regression analyses with age and sex as covariates. From these analyses, we detected 382 CNP loci, clustered in 30 collapsed copy number variable regions (CNVRs), that were significantly associated with the levels of 17 plasma protein biomarkers (p < 4.68 × 10⁻¹⁰).
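
Note on the significance threshold: the record does not say how the cutoff was derived. The figures in the summary are, however, consistent with a Bonferroni correction, assuming a family-wise error rate of 0.05 and one test per CNP locus and protein pair. The sketch below only reproduces that arithmetic; the value of `ALPHA` and the function name `bonferroni_threshold` are assumptions made for illustration, not details taken from the thesis.

```python
# Counts taken from the summary above.
N_CNP_LOCI = 243_987
N_PROTEINS = 438

# Assumed family-wise error rate; the record does not state it.
ALPHA = 0.05


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test p-value cutoff under a Bonferroni correction."""
    return alpha / n_tests


# Assumption: one regression test per (CNP locus, protein) pair.
n_tests = N_CNP_LOCI * N_PROTEINS
threshold = bonferroni_threshold(ALPHA, n_tests)

print(f"{n_tests:,} tests, threshold = {threshold:.2e}")
# prints: 106,866,306 tests, threshold = 4.68e-10
```

Under these assumptions the computed cutoff matches the reported p < 4.68 × 10⁻¹⁰, which suggests, but does not confirm, that the threshold accounts for all locus and protein combinations.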