Machine learning-based DNA methylation score for fetal exposure to maternal smoking: development and validation in samples collected from adolescents and adults


Bibliographic Details
Main Authors: Rauschert, Sebastian; Melton, Phillip E.; Heiskala, Anni; Karhunen, Ville; Burdge, Graham; Craig, Jeffrey M.; Godfrey, Keith; Lillycrop, Karen; Mori, Trevor A.; Beilin, Lawrence J.; Oddy, Wendy H.; Pennell, Craig; Järvelin, Marjo-Riitta; Sebert, Sylvain; Huang, Rae-Chi
Format: Article in Journal/Newspaper
Language: English
Published: 2020
Online Access: https://eprints.soton.ac.uk/443740/
https://eprints.soton.ac.uk/443740/1/Smoking_Score_Manuscript_Rauschert_final.docx
https://eprints.soton.ac.uk/443740/2/Figure_1.pdf
https://eprints.soton.ac.uk/443740/3/Figure_2.pdf
https://eprints.soton.ac.uk/443740/4/Supplement_S1_revised_new_Rauschert.docx
https://eprints.soton.ac.uk/443740/5/Copy_of_Supplement_S2_revised.xlsx
Description
Summary:

Background: fetal exposure to maternal smoking during pregnancy is associated with the development of non-communicable diseases in the offspring. It may induce such long-term effects through persistent changes in the DNA methylome, which therefore has the potential to serve as a biomarker of this early-life exposure. With the falling cost of measuring DNA methylation, we aimed to develop a score that can be applied to adolescent DNA methylation data to estimate in utero smoke exposure.

Methods: we used machine learning methods to create a score reflecting exposure to maternal smoking during pregnancy. The score is based on peripheral blood measurements of DNA methylation (Illumina's Infinium HumanMethylation450 BeadChip). It was developed and tested in the Raine Study with data from 995 Caucasian 17-year-old participants using 10-fold cross-validation. It was further tested and validated in independent data from the Northern Finland Birth Cohort 1986 (NFBC1986; 16-year-olds) and 1966 (NFBC1966; 31-year-olds). Three previously proposed DNA methylation scores were applied for comparison. The final score was developed with 204 CpGs using elastic net regression.

Results: sensitivity and specificity for the best-performing previously developed classifier ('Reese Score') were 88% and 72% for Raine, 87% and 61% for NFBC1986, and 72% and 70% for NFBC1966, respectively. The corresponding figures for the elastic net regression approach were 91% and 76% (Raine), 87% and 75% (NFBC1986), and 72% and 78% (NFBC1966).

Conclusion: we have developed a DNA methylation score for exposure to maternal smoking during pregnancy that outperforms the three previously developed scores. The score could be used for model adjustment, or to assess associations with distal health outcomes where part of the effect can be attributed to maternal smoking. It may also provide a biomarker for fetal exposure to maternal smoking.
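
As an illustration of how a score of this kind can be applied and evaluated, the following minimal Python sketch computes a weighted sum of CpG beta values, thresholds it into a binary exposure call, and reports sensitivity and specificity. It assumes the score takes the linear form produced by a regression model. The CpG names, weights, threshold and sample data are hypothetical placeholders, not the 204 CpGs or coefficients of the published score.

```python
from typing import Dict, Sequence, Tuple


def methylation_score(betas: Dict[str, float],
                      weights: Dict[str, float],
                      intercept: float = 0.0) -> float:
    """Weighted sum of CpG beta values; CpGs absent from the sample are skipped."""
    return intercept + sum(w * betas[cpg] for cpg, w in weights.items() if cpg in betas)


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one exposed and one unexposed sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical weights and cut-off (assumptions for illustration only).
weights = {"cpg_a": -2.0, "cpg_b": 1.5}
threshold = 0.0

# Hypothetical samples: (beta values per CpG, true exposure status).
samples = [
    ({"cpg_a": 0.60, "cpg_b": 0.90}, 1),
    ({"cpg_a": 0.80, "cpg_b": 0.50}, 0),
    ({"cpg_a": 0.70, "cpg_b": 0.95}, 0),
    ({"cpg_a": 0.85, "cpg_b": 0.60}, 0),
    ({"cpg_a": 0.55, "cpg_b": 0.80}, 1),
    ({"cpg_a": 0.75, "cpg_b": 0.70}, 1),
]

y_true = [label for _, label in samples]
y_pred = [int(methylation_score(betas, weights) > threshold) for betas, _ in samples]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In practice the weights would come from a model fitted on training data, and the threshold would be chosen on that same training data before the score is applied to an independent cohort.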