Calling differentially methylated regions from whole genome bisulphite sequencing with DMRcate

Bibliographic Details
Published in: Nucleic Acids Research
Main Authors: Peters, Timothy J, Buckley, Michael J, Chen, Yunshun, Smyth, Gordon K, Goodnow, Christopher C, Clark, Susan J
Format: Text
Language: English
Published: Oxford University Press 2021
Subjects:
DML
Online Access: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8565305/
http://www.ncbi.nlm.nih.gov/pubmed/34320181
https://doi.org/10.1093/nar/gkab637
Description
Summary: Whole genome bisulphite sequencing (WGBS) permits the genome-wide study of single-molecule methylation patterns. One of the key goals of mammalian cell-type identity studies, in both normal differentiation and disease, is to locate differential methylation patterns across the genome. We discuss the most desirable characteristics for DML (differentially methylated locus) and DMR (differentially methylated region) detection tools in a genome-wide context, and choose for benchmarking a set of statistical methods that fully or partially satisfy these considerations. Our data simulation strategy is both thorough and biologically informed, employing distribution parameters derived from large-scale consortium datasets. We report DML detection ability with respect to coverage, group methylation difference, sample size, variability and covariate size, both marginally and jointly, and exhaustively with respect to parameter combination. We also benchmark these methods on FDR control and computational time. We use these results to choose the backend for, and to introduce, an expanded version of DMRcate: an existing DMR detection tool for microarray data that we have extended to call DMRs from WGBS data. We compare DMRcate to a set of alternative DMR callers using a similarly realistic simulation strategy. We find that DMRcate and RADmeth are the best predictors of DMRs, and conclusively find DMRcate to be the fastest.