Double/debiased machine learning for treatment and structural parameters

Full description

We revisit the classic semi-parametric problem of inference on a low-dimensional parameter θ₀ in the presence of high-dimensional nuisance parameters η₀. We depart from the classical setting by allowing for η₀ to be so high-dimensional that the traditional assumptions (e.g. Donsker properties) that limit complexity of the parameter space for this object break down. To estimate η₀, we consider the use of statistical or machine learning (ML) methods, which are particularly well suited to estimation in modern, very high-dimensional cases. ML methods perform well by employing regularization to reduce variance and trading off regularization bias with overfitting in practice. However, both regularization bias and overfitting in estimating η₀ cause a heavy bias in estimators of θ₀ that are obtained by naively plugging ML estimators of η₀ into estimating equations for θ₀. This bias results in the naive estimator failing to be N^{-1/2} consistent, where N is the sample size. We show that the impact of regularization bias and overfitting on estimation of the parameter of interest θ₀ can be removed by using two simple, yet critical, ingredients: (1) using Neyman-orthogonal moments/scores that have reduced sensitivity with respect to nuisance parameters to estimate θ₀; (2) making use of cross-fitting, which provides an efficient form of data-splitting. We call the resulting set of methods double or debiased ML (DML). We verify that DML delivers point estimators that concentrate in an N^{-1/2}-neighbourhood of the true parameter values and are approximately unbiased and normally distributed, which allows construction of valid confidence statements. The generic statistical theory of DML is elementary and simultaneously relies on only weak theoretical requirements, which admit the use of a broad array of modern ML methods for estimating the nuisance parameters, such as random forests, lasso, ridge, deep neural nets, boosted trees, and various hybrids and ensembles of these methods. We illustrate the general theory by applying it to provide theoretical properties of the following: DML applied to learn the main regression parameter in a partially linear regression model; DML applied to learn the coefficient on an endogenous variable in a partially linear instrumental variables model; DML applied to learn the average treatment effect and the average treatment effect on the treated under unconfoundedness; DML applied to learn the local average treatment effect in an instrumental variables setting. In addition to these theoretical applications, we also illustrate the use of DML in three empirical examples.
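To make the two ingredients concrete, here is a minimal illustrative sketch of DML for the partially linear regression model named in the abstract, combining cross-fitting (nuisance functions fitted out-of-fold) with the Neyman-orthogonal partialling-out score. This is not the authors' code: scikit-learn, the random-forest learners, the function name dml_plr, and the synthetic data are all assumptions made for illustration.

```python
# Sketch of DML for the partially linear model
#   Y = D * theta0 + g0(X) + U,   D = m0(X) + V,
# with cross-fitting and the orthogonal (partialling-out) moment.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_plr(y, d, X, n_folds=5, seed=0):
    """Cross-fitted DML point estimate of theta0 and its standard error."""
    y_res = np.zeros_like(y, dtype=float)  # out-of-fold Y - E[Y|X]
    d_res = np.zeros_like(d, dtype=float)  # out-of-fold D - E[D|X]
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        # Nuisance estimators see only the training folds (cross-fitting).
        ml_y = RandomForestRegressor(random_state=seed).fit(X[train], y[train])
        ml_d = RandomForestRegressor(random_state=seed).fit(X[train], d[train])
        y_res[test] = y[test] - ml_y.predict(X[test])
        d_res[test] = d[test] - ml_d.predict(X[test])
    # Orthogonal moment: E[(Y - E[Y|X] - theta * (D - E[D|X])) * (D - E[D|X])] = 0
    theta = np.sum(d_res * y_res) / np.sum(d_res ** 2)
    # Plug-in sandwich standard error based on the score psi.
    psi = (y_res - theta * d_res) * d_res
    se = np.sqrt(np.mean(psi ** 2) / np.mean(d_res ** 2) ** 2 / len(y))
    return theta, se

# Synthetic example with true theta0 = 0.5.
rng = np.random.default_rng(0)
n, p = 1000, 20
X = rng.normal(size=(n, p))
d = np.sin(X[:, 0]) + rng.normal(size=n)
y = 0.5 * d + np.cos(X[:, 1]) + rng.normal(size=n)
theta_hat, se = dml_plr(y, d, X)
print(f"theta_hat = {theta_hat:.3f} +/- {1.96 * se:.3f}")
```

As the abstract notes, the random forests can be swapped for lasso, ridge, boosted trees, or neural nets without changing the orthogonal moment or the cross-fitting loop; only the out-of-fold residualization step is affected.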

Bibliographic Details
Published in: The Econometrics Journal, Volume 21, Issue 1 (2018), pp. C1–C68
Main Authors: Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, James Robins
Format: Article in Journal/Newspaper
Language: English
Subjects: DML
Online Access: https://doi.org/10.1111/ectj.12097
Collection: RePEc (Research Papers in Economics)