Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores
Due to the high cost of DNA sequencing for large-scale data, I propose a two-phase design using polygenic risk scores (PRS) to inform selection of individuals in phase 1, followed by regional sequencing in a selected subsample in phase 2. Residual dependent sampling (RDS) design is implemented by re...
Main Author: | Wang, Guan |
---|---|
Other Authors: | Bull, Shelley B; Espin-Garcia, Osvaldo; Dalla Lana School of Public Health |
Format: | Thesis |
Language: | unknown |
Published: | University of Toronto, 2022 |
Subjects: | |
Online Access: | http://hdl.handle.net/1807/110846 |
id |
ftunivtoronto:oai:tspace.library.utoronto.ca:1807/110846 |
---|---|
record_format |
openpolar |
spelling |
ftunivtoronto:oai:tspace.library.utoronto.ca:1807/110846 2023-05-15T17:42:29+02:00 Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores Wang, Guan Bull, Shelley B Espin-Garcia, Osvaldo Dalla Lana School of Public Health 2022-03-23T15:39:47Z application/pdf http://hdl.handle.net/1807/110846 unknown University of Toronto http://hdl.handle.net/1807/110846 0308 Thesis 2022 ftunivtoronto 2022-03-27T17:23:14Z Due to the high cost of DNA sequencing for large-scale data, I propose a two-phase design using polygenic risk scores (PRS) to inform selection of individuals in phase 1, followed by regional sequencing in a selected subsample in phase 2. The residual-dependent sampling (RDS) design is implemented by regressing the phenotype of interest on the PRS and selecting individuals with extreme residuals as the phase 2 subsample. Efficient analysis can be carried out under semi-parametric modelling using the EM algorithm. A fine-mapping application in a genome-wide association study (GWAS) of triglyceride levels in 4,504 individuals from the Northern Finland Birth Cohort of 1966 shows that the proposed method can reduce sequencing costs in post-GWAS analyses while maintaining statistical performance. Simulation studies show that the proposed RDS design gives more precise estimation than simple random sampling, with adequate type I error control, and performs more similarly to the complete sample. M.Sc. Thesis Northern Finland University of Toronto: Research Repository T-Space |
institution |
Open Polar |
collection |
University of Toronto: Research Repository T-Space |
op_collection_id |
ftunivtoronto |
language |
unknown |
topic |
0308 |
spellingShingle |
0308 Wang, Guan Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores |
topic_facet |
0308 |
description |
Due to the high cost of DNA sequencing for large-scale data, I propose a two-phase design using polygenic risk scores (PRS) to inform selection of individuals in phase 1, followed by regional sequencing in a selected subsample in phase 2. The residual-dependent sampling (RDS) design is implemented by regressing the phenotype of interest on the PRS and selecting individuals with extreme residuals as the phase 2 subsample. Efficient analysis can be carried out under semi-parametric modelling using the EM algorithm. A fine-mapping application in a genome-wide association study (GWAS) of triglyceride levels in 4,504 individuals from the Northern Finland Birth Cohort of 1966 shows that the proposed method can reduce sequencing costs in post-GWAS analyses while maintaining statistical performance. Simulation studies show that the proposed RDS design gives more precise estimation than simple random sampling, with adequate type I error control, and performs more similarly to the complete sample. M.Sc. |
author2 |
Bull, Shelley B Espin-Garcia, Osvaldo Dalla Lana School of Public Health |
format |
Thesis |
author |
Wang, Guan |
author_facet |
Wang, Guan |
author_sort |
Wang, Guan |
title |
Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores |
title_short |
Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores |
title_full |
Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores |
title_fullStr |
Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores |
title_full_unstemmed |
Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores |
title_sort |
two-phase design for regional genetic sequencing using polygenic risk scores |
publisher |
University of Toronto |
publishDate |
2022 |
url |
http://hdl.handle.net/1807/110846 |
genre |
Northern Finland |
genre_facet |
Northern Finland |
op_relation |
http://hdl.handle.net/1807/110846 |
_version_ |
1766144360350482432 |
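
The selection step described in the abstract (regress the phenotype on the PRS, then keep the individuals with extreme residuals) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `rds_select`, the use of ordinary least squares with an intercept, the equal split between the lower and upper residual tails, and the simulated data are all assumptions made for illustration. The semi-parametric EM analysis of the phase 2 data is not shown.

```python
import numpy as np


def rds_select(phenotype, prs, n_phase2):
    """Pick a phase 2 subsample from the residual tails (illustrative sketch).

    Returns the sorted indices of the selected individuals and the
    residuals from the phase 1 regression of phenotype on PRS.
    """
    y = np.asarray(phenotype, dtype=float)
    x = np.asarray(prs, dtype=float)
    if not 0 < n_phase2 <= y.size:
        raise ValueError("n_phase2 must be between 1 and the sample size")

    # Phase 1: least-squares fit of phenotype on PRS with an intercept.
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef

    # Assumption: split the phase 2 budget evenly between the two tails.
    order = np.argsort(residuals)
    n_low = n_phase2 // 2
    n_high = n_phase2 - n_low
    selected = np.concatenate([order[:n_low], order[order.size - n_high:]])
    return np.sort(selected), residuals


# Hypothetical usage on simulated data (not the cohort data from the thesis).
rng = np.random.default_rng(0)
prs = rng.normal(size=1000)
phenotype = 0.5 * prs + rng.normal(size=1000)
phase2_idx, resid = rds_select(phenotype, prs, n_phase2=200)
```

Only the individuals indexed by `phase2_idx` would be sent for regional sequencing; how the tails are defined (by count, quantile, or threshold) is a design choice that the abstract does not specify.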