Double Machine Learning and Automated Confounder Selection -- A Cautionary Tale
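Double machine learning (DML) is becoming an increasingly popular tool for automatic model selection in high-dimensional settings. These approaches rely on the assumption of conditional independence, which may not hold in big-data settings where the covariate space is large. This paper shows that DML is very sensitive to the inclusion of even a few "bad controls" in the covariate space. The resulting bias varies with the nature of the causal model, which raises concerns about the feasibility of selecting control variables in a data-driven way.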
Main Authors: | Hünermund, Paul; Louw, Beyers; Caspi, Itamar |
---|---|
Format: | Text |
Language: | unknown |
Published: | 2021 |
Subjects: | stat, psy |
Online Access: | http://arxiv.org/abs/2108.11294 |
id |
fttriple:oai:gotriple.eu:10670/1.n5q8jt |
---|---|
record_format |
openpolar |
spelling |
fttriple:oai:gotriple.eu:10670/1.n5q8jt 2023-05-15T16:01:23+02:00 Double Machine Learning and Automated Confounder Selection -- A Cautionary Tale Hünermund, Paul Louw, Beyers Caspi, Itamar 2021-08-25 http://arxiv.org/abs/2108.11294 undefined unknown 10670/1.n5q8jt http://arxiv.org/abs/2108.11294 undefined arXiv stat psy Text https://vocabularies.coar-repositories.org/resource_types/c_18cf/ 2021 fttriple 2023-01-22T18:13:57Z Double machine learning (DML) is becoming an increasingly popular tool for automatic model selection in high-dimensional settings. These approaches rely on the assumption of conditional independence, which may not hold in big-data settings where the covariate space is large. This paper shows that DML is very sensitive to the inclusion of even a few "bad controls" in the covariate space. The resulting bias varies with the nature of the causal model, which raises concerns about the feasibility of selecting control variables in a data-driven way. Text DML Unknown |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
fttriple |
language |
unknown |
topic |
stat psy |
description |
Double machine learning (DML) is becoming an increasingly popular tool for automatic model selection in high-dimensional settings. These approaches rely on the assumption of conditional independence, which may not hold in big-data settings where the covariate space is large. This paper shows that DML is very sensitive to the inclusion of even a few "bad controls" in the covariate space. The resulting bias varies with the nature of the causal model, which raises concerns about the feasibility of selecting control variables in a data-driven way. |
format |
Text |
author |
Hünermund, Paul; Louw, Beyers; Caspi, Itamar |
title |
Double Machine Learning and Automated Confounder Selection -- A Cautionary Tale |
publishDate |
2021 |
url |
http://arxiv.org/abs/2108.11294 |
genre |
DML |
op_source |
arXiv |
op_relation |
10670/1.n5q8jt http://arxiv.org/abs/2108.11294 |
op_rights |
undefined |
_version_ |
1766397270015606784 |