Machine learning in the estimation of causal effects: targeted minimum loss-based estimation and double/debiased machine learning
Summary: In recent decades, the fields of statistical and machine learning have seen a revolution in the development of data-adaptive regression methods that have optimal performance under flexible, sometimes minimal, assumptions on the true regression functions. These developments have impacted all...
Published in: | Biostatistics |
---|---|
Main Author: | Díaz, Iván |
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | Oxford University Press (OUP), 2019 |
Subjects: | |
Online Access: | http://dx.doi.org/10.1093/biostatistics/kxz042 http://academic.oup.com/biostatistics/advance-article-pdf/doi/10.1093/biostatistics/kxz042/30988844/kxz042.pdf |
id |
croxfordunivpr:10.1093/biostatistics/kxz042 |
---|---|
record_format |
openpolar |
institution |
Open Polar |
collection |
Oxford University Press |
op_collection_id |
croxfordunivpr |
language |
English |
description |
Summary: In recent decades, the fields of statistical and machine learning have seen a revolution in the development of data-adaptive regression methods that have optimal performance under flexible, sometimes minimal, assumptions on the true regression functions. These developments have impacted all areas of applied and theoretical statistics and have allowed data analysts to avoid the biases incurred under the pervasive practice of parametric model misspecification. In this commentary, I discuss issues around the use of data-adaptive regression in estimation of causal inference parameters. To ground ideas, I focus on two estimation approaches with roots in semi-parametric estimation theory: targeted minimum loss-based estimation (TMLE; van der Laan and Rubin, 2006) and double/debiased machine learning (DML; Chernozhukov and others, 2018). This commentary is not comprehensive, the literature on these topics is rich, and there are many subtleties and developments which I do not address. These two frameworks represent only a small fraction of an increasingly large number of methods for causal inference using machine learning. To my knowledge, they are the only methods grounded in statistical semi-parametric theory that also allow unrestricted use of data-adaptive regression techniques. |
format |
Article in Journal/Newspaper |
author |
Díaz, Iván |
title |
Machine learning in the estimation of causal effects: targeted minimum loss-based estimation and double/debiased machine learning |
publisher |
Oxford University Press (OUP) |
publishDate |
2019 |
url |
http://dx.doi.org/10.1093/biostatistics/kxz042 http://academic.oup.com/biostatistics/advance-article-pdf/doi/10.1093/biostatistics/kxz042/30988844/kxz042.pdf |
genre |
DML |
op_source |
Biostatistics ISSN 1465-4644 1468-4357 |
op_rights |
https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model |
op_doi |
https://doi.org/10.1093/biostatistics/kxz042 |
container_title |
Biostatistics |