Calling differentially methylated regions from whole genome bisulphite sequencing with DMRcate

Bibliographic Details
Published in: Nucleic Acids Research
Main Authors: Peters, Timothy J; Buckley, Michael J; Chen, Yunshun; Smyth, Gordon K; Goodnow, Christopher C; Clark, Susan J
Other Authors: National Health and Medical Research Council, NHMRC, Bill & Patricia Ritchie Foundation
Format: Article in Journal/Newspaper
Language: English
Published: Oxford University Press (OUP) 2021
Subjects:
DML
Online Access: http://dx.doi.org/10.1093/nar/gkab637
http://academic.oup.com/nar/article-pdf/49/19/e109/41071508/gkab637.pdf
Description
Summary: Whole genome bisulphite sequencing (WGBS) permits the genome-wide study of single molecule methylation patterns. One of the key goals of mammalian cell-type identity studies, in both normal differentiation and disease, is to locate differential methylation patterns across the genome. We discuss the most desirable characteristics for DML (differentially methylated locus) and DMR (differentially methylated region) detection tools in a genome-wide context and choose a set of statistical methods that fully or partially satisfy these considerations to compare for benchmarking. Our data simulation strategy is both biologically informed (employing distribution parameters derived from large-scale consortium datasets) and thorough. We report DML detection ability with respect to coverage, group methylation difference, sample size, variability and covariate size, both marginally and jointly, and exhaustively with respect to parameter combination. We also benchmark these methods on FDR control and computational time. We use this result to backend and introduce an expanded version of DMRcate: an existing DMR detection tool for microarray data that we have extended to now call DMRs from WGBS data. We compare DMRcate to a set of alternative DMR callers using a similarly realistic simulation strategy. We find DMRcate and RADmeth are the best predictors of DMRs, and conclusively find DMRcate the fastest.