Double Machine Learning and Automated Confounder Selection -- A Cautionary Tale

Bibliographic Details
Main Authors:	Hünermund, Paul; Louw, Beyers; Caspi, Itamar
Format:	Text
Language:	unknown
Published:	2021
Subjects:	stat psy DML
Online Access:	http://arxiv.org/abs/2108.11294

Description
Summary:	Double machine learning (DML) is an increasingly popular tool for automated model selection in high-dimensional settings. DML relies on the assumption of conditional independence, which may not hold in big-data settings where the covariate space is large. This paper shows that DML is very sensitive to the inclusion of even a few "bad controls" in the covariate space. The resulting bias varies with the nature of the causal model, which raises concerns about the feasibility of selecting control variables in a data-driven way.
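
The sensitivity to "bad controls" described in the summary can be illustrated with a small simulation. The sketch below is not the authors' code or simulation design: the data-generating process, the coefficient values, and the helper name `partial_out_effect` are all assumptions made for illustration, and ordinary least squares stands in for the machine-learning nuisance learners that DML would normally use. It applies a cross-fitted partialling-out estimator twice, once with only a genuine confounder as control and once with a collider (a variable caused by both treatment and outcome) added to the control set.

```python
import numpy as np


def partial_out_effect(y, d, X, n_folds=2, seed=0):
    """Cross-fitted partialling-out estimate of the effect of d on y.

    Hypothetical helper for illustration only. OLS replaces the
    ML learners that a real DML pipeline would plug in here.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    Z = np.column_stack([np.ones(n), X])  # controls plus intercept
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Fit nuisance regressions on the training folds only.
        beta_y, *_ = np.linalg.lstsq(Z[train], y[train], rcond=None)
        beta_d, *_ = np.linalg.lstsq(Z[train], d[train], rcond=None)
        # Residualize outcome and treatment on the held-out fold.
        y_res[test_idx] = y[test_idx] - Z[test_idx] @ beta_y
        d_res[test_idx] = d[test_idx] - Z[test_idx] @ beta_d
    # Final stage: regress outcome residuals on treatment residuals.
    return float(d_res @ y_res / (d_res @ d_res))


# Assumed data-generating process (not taken from the paper).
rng = np.random.default_rng(42)
n = 20_000
x = rng.normal(size=n)                      # genuine confounder
d = 0.8 * x + rng.normal(size=n)            # treatment
y = 1.0 * d + 0.5 * x + rng.normal(size=n)  # outcome, true effect = 1.0
c = d + y + rng.normal(size=n)              # collider: a "bad control"

good = partial_out_effect(y, d, x.reshape(-1, 1))
bad = partial_out_effect(y, d, np.column_stack([x, c]))

print(f"controls = [x]:    estimate = {good:.3f}")
print(f"controls = [x, c]: estimate = {bad:.3f}")
```

Under these assumed parameters, the estimate that controls only for `x` should land near the true effect of 1.0, while adding the single collider `c` pulls the estimate far away from it (toward 0 in this particular design). Changing the coefficients in the line that defines `c` changes the size of the bias, which mirrors the summary's point that the bias depends on the underlying causal model.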

Similar Items