Regularizing double machine learning in partially linear endogenous models
The linear coefficient in a partially linear model with confounding variables can be estimated using double machine learning (DML). However, this DML estimator has a two-stage least squares (TSLS) interpretation and may produce overly wide confidence intervals. To address this issue, we propose a re...
Main Authors: | Emmenegger, Corinne; Bühlmann, Peter |
---|---|
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | Cornell University, 2021 |
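Subjects: | Double machine learning; Endogenous variables; Generalized method of moments; Instrumental variables; K-class estimation; Partially linear model; Regularization; Semiparametric estimation; Two-stage least squares |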
Online Access: | https://hdl.handle.net/20.500.11850/525119 https://doi.org/10.3929/ethz-b-000525119 |
id |
ftethz:oai:www.research-collection.ethz.ch:20.500.11850/525119 |
---|---|
record_format |
openpolar |
institution |
Open Polar |
collection |
ETH Zürich Research Collection |
op_collection_id |
ftethz |
language |
English |
topic |
Double machine learning; Endogenous variables; Generalized method of moments; Instrumental variables; K-class estimation; Partially linear model; Regularization; Semiparametric estimation; Two-stage least squares |
description |
The linear coefficient in a partially linear model with confounding variables can be estimated using double machine learning (DML). However, this DML estimator has a two-stage least squares (TSLS) interpretation and may produce overly wide confidence intervals. To address this issue, we propose a regularization and selection scheme, regsDML, which leads to narrower confidence intervals. It selects either the TSLS DML estimator or a regularization-only estimator depending on whose estimated variance is smaller. The regularization-only estimator is tailored to have a low mean squared error. The regsDML estimator is fully data driven. The regsDML estimator converges at the parametric rate, is asymptotically Gaussian distributed, and asymptotically equivalent to the TSLS DML estimator, but regsDML exhibits substantially better finite sample properties. The regsDML estimator uses the idea of k-class estimators, and we show how DML and k-class estimation can be combined to estimate the linear coefficient in a partially linear endogenous model. Empirical examples demonstrate our methodological and theoretical developments. Software code for our regsDML method is available in the R-package dmlalg. ISSN: 1935-7524 |
format |
Article in Journal/Newspaper |
author |
Emmenegger, Corinne (ORCID: 0000-0003-0353-8888); Bühlmann, Peter |
author_sort |
Emmenegger, Corinne |
title |
Regularizing double machine learning in partially linear endogenous models |
title_sort |
regularizing double machine learning in partially linear endogenous models |
publisher |
Cornell University |
publishDate |
2021 |
url |
https://hdl.handle.net/20.500.11850/525119 https://doi.org/10.3929/ethz-b-000525119 |
genre |
DML |
op_source |
Electronic Journal of Statistics, 15 (2) |
op_relation |
info:eu-repo/semantics/altIdentifier/doi/10.1214/21-ejs1931 info:eu-repo/semantics/altIdentifier/wos/000740666000062 info:eu-repo/grantAgreement/EC/H2020/786461 http://hdl.handle.net/20.500.11850/525119 doi:10.3929/ethz-b-000525119 |
op_rights |
info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by/4.0/ Creative Commons Attribution 4.0 International |
op_doi |
20.500.11850/525119 10.3929/ethz-b-000525119 10.1214/21-ejs1931 |
_version_ |
1774294150800211968 |
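
The description above notes that the DML estimator has a TSLS interpretation and that regsDML builds on k-class estimators. As a rough illustration of that family only, the sketch below implements the textbook k-class estimator with NumPy on simulated data. It is not the regsDML procedure and not the dmlalg API: the function name `k_class`, the variable names, and the simulated data are all hypothetical. It also assumes the variables have already been residualized on the nonparametric covariates, so the machine learning and sample-splitting steps of DML are omitted entirely.

```python
import numpy as np

def k_class(ry, rx, ra, k):
    """Textbook k-class estimate of beta in ry = rx @ beta + error.

    ra holds the instruments. k = 0 reproduces OLS, k = 1 reproduces TSLS.
    (Hypothetical helper for illustration; not part of dmlalg.)
    """
    # Orthogonal projection onto the column space of the instruments.
    p_a = ra @ np.linalg.pinv(ra)
    # Weighting matrix I - k * (I - P_A), rewritten as (1 - k) * I + k * P_A.
    w = (1.0 - k) * np.eye(len(ry)) + k * p_a
    return np.linalg.solve(rx.T @ w @ rx, rx.T @ w @ ry)

# Simulated data (an assumption for this sketch): a hidden confounder h
# makes the regressor x endogenous, and a serves as the instrument.
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=(n, 1))
h = rng.normal(size=(n, 1))
x = a + h + rng.normal(size=(n, 1))
y = 2.0 * x + h + rng.normal(size=(n, 1))

beta_ols = k_class(y, x, a, k=0.0)    # ignores the endogeneity of x
beta_tsls = k_class(y, x, a, k=1.0)   # uses only instrument-driven variation
beta_mid = k_class(y, x, a, k=0.5)    # an intermediate member of the family
```

Intermediate values of `k` trade off between the OLS and TSLS endpoints. How regsDML chooses its regularization and then selects between estimators by estimated variance is specified in the paper and the dmlalg package, not in this sketch.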