Regularizing Double Machine Learning in Partially Linear Endogenous Models
The linear coefficient in a partially linear model with confounding variables can be estimated using double machine learning (DML). However, this DML estimator has a two-stage least squares (TSLS) interpretation and may produce overly wide confidence intervals. To address this issue, we propose a regularization and selection scheme, regsDML, which leads to narrower confidence intervals.
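The full description in this record notes that regsDML builds on k-class estimators applied on top of DML. The following is a minimal sketch of that combination, not the authors' implementation and not the API of the dmlalg package. It assumes one common k-class parametrization (the one used in anchor regression, where gamma = 1 gives ordinary least squares and large gamma approaches TSLS), uses plain linear regression as a stand-in for the machine-learning nuisance fits, and omits cross-fitting and the variance-based selection step. All function and variable names, and the simulated data, are hypothetical.

```python
import numpy as np

def residualize(target, W):
    """Residuals of `target` after least squares on W plus an intercept.
    Assumption: a linear stand-in for the ML nuisance regressions in DML."""
    design = np.column_stack([np.ones(len(W)), W])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

def k_class(r_y, r_x, r_a, gamma):
    """Minimise ||(I - P)(r_y - r_x b)||^2 + gamma * ||P (r_y - r_x b)||^2,
    where P projects onto the columns of the residualised instrument r_a."""
    # Apply P to a vector or matrix without forming the n x n projection.
    proj = lambda v: r_a @ np.linalg.lstsq(r_a, v, rcond=None)[0]
    s = np.sqrt(gamma)
    # Transforming with (I - P) + sqrt(gamma) * P turns the problem into OLS.
    x_t = r_x + (s - 1.0) * proj(r_x)
    y_t = r_y + (s - 1.0) * proj(r_y)
    return np.linalg.lstsq(x_t, y_t, rcond=None)[0]

# Hypothetical simulated data: W observed covariate, A instrument,
# H hidden confounder; the true linear coefficient is 0.5.
rng = np.random.default_rng(0)
n = 2000
W = rng.normal(size=n)
A = rng.normal(size=n)
H = rng.normal(size=n)
X = A + W + H + rng.normal(size=n)
Y = 0.5 * X + 2.0 * W + H + rng.normal(size=n)

# Partial W out of the response, the regressor, and the instrument.
r_y = residualize(Y, W)
r_x = residualize(X, W).reshape(-1, 1)
r_a = residualize(A, W).reshape(-1, 1)

# Small gamma stays close to OLS on the residuals; large gamma approaches TSLS.
for gamma in (1.0, 10.0, 1e6):
    print(gamma, k_class(r_y, r_x, r_a, gamma))
```

In this sketch, a finite gamma trades some confounding bias for lower variance, which is the kind of regularization the abstract describes. Choosing gamma from the data and comparing estimated variances would be additional steps.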
Published in: | Electronic Journal of Statistics |
---|---|
Main Authors: | Emmenegger, Corinne; Bühlmann, Peter |
Format: | Text |
Language: | unknown |
Published: | 2021 |
Subjects: | Statistics - Methodology; Mathematics - Statistics Theory |
Online Access: | http://arxiv.org/abs/2101.12525 https://doi.org/10.1214/21-EJS1931 |
id |
ftarxivpreprints:oai:arXiv.org:2101.12525 |
---|---|
record_format |
openpolar |
spelling |
ftarxivpreprints:oai:arXiv.org:2101.12525 2023-09-05T13:19:04+02:00 Regularizing Double Machine Learning in Partially Linear Endogenous Models Emmenegger, Corinne Bühlmann, Peter 2021-01-29 http://arxiv.org/abs/2101.12525 https://doi.org/10.1214/21-EJS1931 unknown http://arxiv.org/abs/2101.12525 doi:10.1214/21-EJS1931 Statistics - Methodology Mathematics - Statistics Theory text 2021 ftarxivpreprints https://doi.org/10.1214/21-EJS1931 2023-08-16T16:19:10Z The linear coefficient in a partially linear model with confounding variables can be estimated using double machine learning (DML). However, this DML estimator has a two-stage least squares (TSLS) interpretation and may produce overly wide confidence intervals. To address this issue, we propose a regularization and selection scheme, regsDML, which leads to narrower confidence intervals. It selects either the TSLS DML estimator or a regularization-only estimator depending on whose estimated variance is smaller. The regularization-only estimator is tailored to have a low mean squared error. The regsDML estimator is fully data driven. The regsDML estimator converges at the parametric rate, is asymptotically Gaussian distributed, and asymptotically equivalent to the TSLS DML estimator, but regsDML exhibits substantially better finite sample properties. The regsDML estimator uses the idea of k-class estimators, and we show how DML and k-class estimation can be combined to estimate the linear coefficient in a partially linear endogenous model. Empirical examples demonstrate our methodological and theoretical developments. Software code for our regsDML method is available in the R-package dmlalg. Comment: new content and revised text Text DML ArXiv.org (Cornell University Library) Electronic Journal of Statistics 15 2 |
institution |
Open Polar |
collection |
ArXiv.org (Cornell University Library) |
op_collection_id |
ftarxivpreprints |
language |
unknown |
topic |
Statistics - Methodology Mathematics - Statistics Theory |
description |
The linear coefficient in a partially linear model with confounding variables can be estimated using double machine learning (DML). However, this DML estimator has a two-stage least squares (TSLS) interpretation and may produce overly wide confidence intervals. To address this issue, we propose a regularization and selection scheme, regsDML, which leads to narrower confidence intervals. It selects either the TSLS DML estimator or a regularization-only estimator depending on whose estimated variance is smaller. The regularization-only estimator is tailored to have a low mean squared error. The regsDML estimator is fully data driven. The regsDML estimator converges at the parametric rate, is asymptotically Gaussian distributed, and asymptotically equivalent to the TSLS DML estimator, but regsDML exhibits substantially better finite sample properties. The regsDML estimator uses the idea of k-class estimators, and we show how DML and k-class estimation can be combined to estimate the linear coefficient in a partially linear endogenous model. Empirical examples demonstrate our methodological and theoretical developments. Software code for our regsDML method is available in the R-package dmlalg. Comment: new content and revised text |
format |
Text |
author |
Emmenegger, Corinne Bühlmann, Peter |
title |
Regularizing Double Machine Learning in Partially Linear Endogenous Models |
publishDate |
2021 |
genre |
DML |
container_title |
Electronic Journal of Statistics |
container_volume |
15 |
container_issue |
2 |