Machine learning in the estimation of causal effects: targeted minimum loss-based estimation and double/debiased machine learning

Summary In recent decades, the fields of statistical and machine learning have seen a revolution in the development of data-adaptive regression methods that have optimal performance under flexible, sometimes minimal, assumptions on the true regression functions. These developments have impacted all...

Bibliographic Details
Published in: Biostatistics
Main Author: Díaz, Iván
Format: Article in Journal/Newspaper
Language: English
Published: Oxford University Press (OUP) 2019
Subjects:
DML
Online Access: http://dx.doi.org/10.1093/biostatistics/kxz042
http://academic.oup.com/biostatistics/advance-article-pdf/doi/10.1093/biostatistics/kxz042/30988844/kxz042.pdf
id croxfordunivpr:10.1093/biostatistics/kxz042
record_format openpolar
institution Open Polar
collection Oxford University Press
op_collection_id croxfordunivpr
language English
description Summary In recent decades, the fields of statistical and machine learning have seen a revolution in the development of data-adaptive regression methods that have optimal performance under flexible, sometimes minimal, assumptions on the true regression functions. These developments have impacted all areas of applied and theoretical statistics and have allowed data analysts to avoid the biases incurred under the pervasive practice of parametric model misspecification. In this commentary, I discuss issues around the use of data-adaptive regression in estimation of causal inference parameters. To ground ideas, I focus on two estimation approaches with roots in semi-parametric estimation theory: targeted minimum loss-based estimation (TMLE; van der Laan and Rubin, 2006) and double/debiased machine learning (DML; Chernozhukov and others, 2018). This commentary is not comprehensive: the literature on these topics is rich, and there are many subtleties and developments that I do not address. These two frameworks represent only a small fraction of an increasingly large number of methods for causal inference using machine learning. To my knowledge, they are the only methods grounded in statistical semi-parametric theory that also allow unrestricted use of data-adaptive regression techniques.
format Article in Journal/Newspaper
author Díaz, Iván
publisher Oxford University Press (OUP)
publishDate 2019
genre DML
genre_facet DML
op_source Biostatistics
ISSN 1465-4644 1468-4357
op_rights https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
op_doi https://doi.org/10.1093/biostatistics/kxz042
container_title Biostatistics
_version_ 1810441324037406720