Applying double machine learning and BART methods to the American Causal Inference Conference 2022 Data Challenge

Bibliographic Details
Main Author: Ruixuan Zhu
Format: Master Thesis
Language: English
Published: 2023
Subjects:
DML
Online Access: https://mediatum.ub.tum.de/1740110
https://mediatum.ub.tum.de/doc/1740110/document.pdf
Description
Summary: In this master thesis, Bayesian Additive Regression Trees (BART), Bayesian Causal Forest (BCF), and Double Machine Learning (DML) are applied to the American Causal Inference Conference 2022 Data Challenge. BCF is a variant of the BART model. All implementations are written in R. The three models are evaluated on four metrics: root mean squared error (RMSE), uncertainty interval coverage, uncertainty interval width, and absolute bias. RMSE and uncertainty interval coverage receive the most attention because the Data Challenge host highlights them. The evaluations show that all three models perform well in terms of RMSE, and that the two BART-based models achieve much better uncertainty interval coverage than DML. Among the BART-based models, BCF outperformed BART. Moreover, the two BART-based models significantly outperformed DML on the subgroup estimands, which is crucial for dealing with treatment effect heterogeneity.
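
The four evaluation metrics named in the summary can be illustrated with a minimal sketch. It is written in Python purely for illustration (the thesis itself uses R) and is not the thesis's or the Data Challenge's scoring code. The function name `evaluate_estimates`, the toy numbers, and the exact definitions (in particular, absolute bias taken as the absolute value of the mean error) are assumptions.

```python
import math


def evaluate_estimates(truth, estimate, lower, upper):
    """Compute four summary metrics for a set of effect estimates.

    All arguments are equal-length lists of floats: the true effects,
    the point estimates, and the lower/upper uncertainty interval bounds.
    The definitions below are assumed, not taken from the thesis.
    """
    n = len(truth)
    errors = [e - t for e, t in zip(estimate, truth)]

    # Root mean squared error of the point estimates.
    rmse = math.sqrt(sum(err ** 2 for err in errors) / n)

    # Absolute bias, assumed here to be |mean(estimate - truth)|.
    abs_bias = abs(sum(errors) / n)

    # Coverage: share of intervals that contain the true effect.
    coverage = sum(lo <= t <= hi for t, lo, hi in zip(truth, lower, upper)) / n

    # Mean width of the uncertainty intervals.
    width = sum(hi - lo for lo, hi in zip(lower, upper)) / n

    return {"rmse": rmse, "abs_bias": abs_bias, "coverage": coverage, "width": width}


# Hypothetical toy values, not results from the thesis.
truth = [1.0, 2.0, 3.0, 4.0]
estimate = [1.5, 1.5, 3.5, 3.5]
lower = [1.0, 1.0, 3.2, 3.0]
upper = [2.0, 2.0, 4.0, 4.0]

metrics = evaluate_estimates(truth, estimate, lower, upper)
print(metrics)
```

In a comparison like the one described, a function of this kind would be applied to each model's estimates in turn. Coverage and width are best read together, since a very wide interval can reach high coverage without being informative.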