Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores

Due to the high cost of DNA sequencing for large-scale data, I propose a two-phase design using polygenic risk scores (PRS) to inform selection of individuals in phase 1, followed by regional sequencing in a selected subsample in phase 2. Residual dependent sampling (RDS) design is implemented by re...

Bibliographic Details
Main Author: Wang, Guan
Other Authors: Bull, Shelley B, Espin-Garcia, Osvaldo, Dalla Lana School of Public Health
Format: Thesis
Language: unknown
Published: University of Toronto 2022
Subjects:
Online Access: http://hdl.handle.net/1807/110846
id ftunivtoronto:oai:tspace.library.utoronto.ca:1807/110846
record_format openpolar
spelling ftunivtoronto:oai:tspace.library.utoronto.ca:1807/110846 2023-05-15T17:42:29+02:00 Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores Wang, Guan Bull, Shelley B Espin-Garcia, Osvaldo Dalla Lana School of Public Health 2022-03-23T15:39:47Z application/pdf http://hdl.handle.net/1807/110846 unknown University of Toronto http://hdl.handle.net/1807/110846 0308 Thesis 2022 ftunivtoronto 2022-03-27T17:23:14Z Due to the high cost of DNA sequencing for large-scale data, I propose a two-phase design using polygenic risk scores (PRS) to inform selection of individuals in phase 1, followed by regional sequencing in a selected subsample in phase 2. A residual-dependent sampling (RDS) design is implemented by regressing the phenotype of interest on the PRS and selecting individuals with extreme residuals as the phase 2 subsample. Efficient analysis can be carried out under semi-parametric modelling by the EM algorithm. A fine-mapping application in a genome-wide association study (GWAS) of triglyceride levels in 4,504 individuals from the Northern Finland Birth Cohort of 1966 shows that the proposed method can reduce sequencing costs in post-GWAS analyses while maintaining statistical performance. Simulation studies show that the proposed RDS design gives more precise estimation than simple random sampling, with adequate type I error control, and performs more similarly to the complete-sample analysis. M.Sc. Thesis Northern Finland University of Toronto: Research Repository T-Space
institution Open Polar
collection University of Toronto: Research Repository T-Space
op_collection_id ftunivtoronto
language unknown
topic 0308
spellingShingle 0308
Wang, Guan
Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores
topic_facet 0308
description Due to the high cost of DNA sequencing for large-scale data, I propose a two-phase design using polygenic risk scores (PRS) to inform selection of individuals in phase 1, followed by regional sequencing in a selected subsample in phase 2. A residual-dependent sampling (RDS) design is implemented by regressing the phenotype of interest on the PRS and selecting individuals with extreme residuals as the phase 2 subsample. Efficient analysis can be carried out under semi-parametric modelling by the EM algorithm. A fine-mapping application in a genome-wide association study (GWAS) of triglyceride levels in 4,504 individuals from the Northern Finland Birth Cohort of 1966 shows that the proposed method can reduce sequencing costs in post-GWAS analyses while maintaining statistical performance. Simulation studies show that the proposed RDS design gives more precise estimation than simple random sampling, with adequate type I error control, and performs more similarly to the complete-sample analysis. M.Sc.
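The selection step described in the abstract (regress the phenotype on the PRS, then keep individuals with extreme residuals) can be illustrated with a minimal sketch. This is not the thesis code. The function name `rds_select`, the use of ordinary least squares with an intercept, the ranking by absolute residual, and the simulated data are all assumptions made for illustration; the thesis may define "extreme" differently (for example, by sampling each tail separately).

```python
import numpy as np


def rds_select(phenotype, prs, n_phase2):
    """Hypothetical helper: pick the n_phase2 individuals whose residuals
    from a phenotype-on-PRS regression are largest in absolute value."""
    y = np.asarray(phenotype, dtype=float)
    x = np.asarray(prs, dtype=float)

    # Ordinary least squares fit of phenotype on PRS, with an intercept.
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef

    # Assumption: "extreme" means largest |residual|, both tails pooled.
    order = np.argsort(-np.abs(residuals), kind="stable")
    return np.sort(order[:n_phase2]), residuals


# Simulated phase 1 data (illustrative only, not the cohort data).
rng = np.random.default_rng(0)
prs = rng.normal(size=1000)
phenotype = 0.5 * prs + rng.normal(size=1000)

selected, residuals = rds_select(phenotype, prs, n_phase2=200)
print(len(selected))
```

The returned indices identify the phase 2 subsample that would be sent for regional sequencing. The subsequent semi-parametric analysis by the EM algorithm is not covered by this sketch.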
author2 Bull, Shelley B
Espin-Garcia, Osvaldo
Dalla Lana School of Public Health
format Thesis
author Wang, Guan
author_facet Wang, Guan
author_sort Wang, Guan
title Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores
title_short Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores
title_full Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores
title_fullStr Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores
title_full_unstemmed Two-phase Design for Regional Genetic Sequencing using Polygenic Risk Scores
title_sort two-phase design for regional genetic sequencing using polygenic risk scores
publisher University of Toronto
publishDate 2022
url http://hdl.handle.net/1807/110846
genre Northern Finland
genre_facet Northern Finland
op_relation http://hdl.handle.net/1807/110846
_version_ 1766144360350482432