Regularizing Double Machine Learning in Partially Linear Endogenous Models

The linear coefficient in a partially linear model with confounding variables can be estimated using double machine learning (DML). However, this DML estimator has a two-stage least squares (TSLS) interpretation and may produce overly wide confidence intervals. To address this issue, we propose a re...


Bibliographic Details
Published in: Electronic Journal of Statistics
Main Authors: Emmenegger, Corinne, Bühlmann, Peter
Format: Text
Language: unknown
Published: 2021
Subjects:
DML
Online Access: http://arxiv.org/abs/2101.12525
https://doi.org/10.1214/21-EJS1931
id ftarxivpreprints:oai:arXiv.org:2101.12525
record_format openpolar
spelling ftarxivpreprints:oai:arXiv.org:2101.12525 2023-09-05T13:19:04+02:00 2021-01-29 2023-08-16T16:19:10Z
institution Open Polar
collection ArXiv.org (Cornell University Library)
op_collection_id ftarxivpreprints
language unknown
topic Statistics - Methodology
Mathematics - Statistics Theory
description The linear coefficient in a partially linear model with confounding variables can be estimated using double machine learning (DML). However, this DML estimator has a two-stage least squares (TSLS) interpretation and may produce overly wide confidence intervals. To address this issue, we propose a regularization and selection scheme, regsDML, which leads to narrower confidence intervals. It selects either the TSLS DML estimator or a regularization-only estimator depending on whose estimated variance is smaller. The regularization-only estimator is tailored to have a low mean squared error. The regsDML estimator is fully data driven. The regsDML estimator converges at the parametric rate, is asymptotically Gaussian distributed, and asymptotically equivalent to the TSLS DML estimator, but regsDML exhibits substantially better finite sample properties. The regsDML estimator uses the idea of k-class estimators, and we show how DML and k-class estimation can be combined to estimate the linear coefficient in a partially linear endogenous model. Empirical examples demonstrate our methodological and theoretical developments. Software code for our regsDML method is available in the R-package dmlalg. Comment: new content and revised text
format Text
author Emmenegger, Corinne
Bühlmann, Peter
title Regularizing Double Machine Learning in Partially Linear Endogenous Models
publishDate 2021
url http://arxiv.org/abs/2101.12525
https://doi.org/10.1214/21-EJS1931
genre DML
container_title Electronic Journal of Statistics
container_volume 15
container_issue 2
_version_ 1776199883935449088
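
The description above refers to a TSLS interpretation of the DML estimator and to k-class estimators. The following is a minimal Python sketch of how these two ideas can be combined. It is not the dmlalg package and not the authors' regsDML procedure. Everything in it is an assumption made for illustration: the simulated data, the variable names (w for observed covariates, a for an instrument, h for a hidden confounder), the polynomial regression standing in for a machine learning method, and the helper functions poly_features, cross_fit_residuals and k_class. The sketch first removes the nonparametric effect of w from y, x and a with sample splitting, then applies the standard k-class formula to the residuals, where kappa = 0 gives ordinary least squares and kappa = 1 gives TSLS.

```python
import numpy as np


def poly_features(w, degree=5):
    # Polynomial basis in w; a simple stand-in for a machine learning regressor.
    return np.vander(w, degree + 1, increasing=True)


def cross_fit_residuals(v, w, n_folds=2, degree=5, seed=0):
    # Residuals v - E_hat[v | w], where E_hat is fitted on the other folds only.
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(w)), n_folds)
    res = np.empty(len(v))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(w)), test_idx)
        coef = np.linalg.lstsq(
            poly_features(w[train_idx], degree), v[train_idx], rcond=None
        )[0]
        res[test_idx] = v[test_idx] - poly_features(w[test_idx], degree) @ coef
    return res


def k_class(r_y, r_x, r_a, kappa):
    # Standard k-class estimator on residualized data:
    # solves [(1 - kappa) X'X + kappa X'P_A X] b = (1 - kappa) X'y + kappa X'P_A y,
    # where P_A projects onto the column space of the instrument residuals.
    r_x = r_x.reshape(len(r_y), -1)
    r_a = r_a.reshape(len(r_y), -1)

    def proj(v):
        return r_a @ np.linalg.lstsq(r_a, v, rcond=None)[0]

    lhs = (1 - kappa) * r_x.T @ r_x + kappa * r_x.T @ proj(r_x)
    rhs = (1 - kappa) * r_x.T @ r_y + kappa * r_x.T @ proj(r_y)
    return np.linalg.solve(lhs, rhs)


# Simulated partially linear model with a hidden confounder (illustrative only).
rng = np.random.default_rng(1)
n, beta = 5000, 1.5
w = rng.uniform(-2, 2, n)   # observed covariate with nonlinear effects
h = rng.normal(size=n)      # hidden confounder affecting both x and y
a = rng.normal(size=n)      # instrument: affects y only through x
x = a + np.sin(w) + h + rng.normal(size=n)
y = beta * x + np.cos(w) + h + rng.normal(size=n)

# Partial out w from every variable, then estimate beta for several kappa values.
r_y, r_x, r_a = (cross_fit_residuals(v, w) for v in (y, x, a))
estimates = {
    kappa: float(k_class(r_y, r_x, r_a, kappa)[0]) for kappa in (0.0, 0.5, 1.0)
}
print(estimates)
```

In this simulated setting the kappa = 1 (TSLS) estimate should lie close to the true coefficient 1.5, while smaller kappa values move toward the confounded least squares fit. The sketch leaves out the data-driven choice of the regularization strength and the variance-based selection between estimators that the description attributes to regsDML.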