Three applications of machine learning methods in corporate finance

Bibliographic Details
Main Author: Movaghari, Hadi
Format: Thesis
Language: English
Published: 2024
Subjects:
DML
Online Access: https://theses.gla.ac.uk/84298/
https://theses.gla.ac.uk/84298/2/2024MovaghariPhD.pdf
https://doi.org/10.5525/gla.thesis.84298
Description
Summary: This thesis presents three applications of machine learning methods in corporate finance. The first two (Chapters 2 and 3) apply double (or debiased) machine learning (DML) to corporate cash holdings and merger returns, respectively. The third (Chapter 4) empirically evaluates the heterogeneous impacts of the cost of carry on cash holdings using the causal forest (CF) method. I also provide a comprehensive introduction to machine learning techniques and the potential benefits these methods can bring to data analysis in finance (Chapter 1). The motivation for using DML is the large number of explanatory variables in the relevant literature. As the number of features grows, non-linearities and hidden, complex inter-relationships between covariates become more likely. Traditional machine learning methods that rely on a linearity assumption, such as LASSO, cannot handle these conditions. Such methods also suffer from omitted variable bias: variables that are probably relevant in predicting the dependent variable are left out because of model selection mistakes. DML allows non-linearities to be modelled by incorporating specialized machine learning methods such as gradient boosting. In addition, it resolves the omitted variable bias of the naïve estimator through the double use of machine learning methods when estimating the nuisance functions. The motivation for using CF is to examine possible heterogeneity at the firm level, instead of estimating the average relationship across all firms. CF is a random-forest-based method for examining possible heterogeneity at the level of individuals. 
Although such heterogeneity can be detected by conventional approaches such as subsample analysis, these approaches have two shortcomings: data snooping bias and preventing the development of new ...
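
The "double usage of machine learning methods" mentioned in the summary can be illustrated with a minimal sketch of DML for a partially linear model, y = theta*d + g(X) + e. This is not the thesis's own code. The function name `dml_partially_linear`, the synthetic data, the use of scikit-learn's `GradientBoostingRegressor`, and the five-fold cross-fitting are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold


def dml_partially_linear(y, d, X, n_splits=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + e.

    Hypothetical helper for illustration only.
    """
    y_res = np.zeros(len(y))
    d_res = np.zeros(len(d))
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train, test in folds.split(X):
        # First use of ML: predict the outcome from the covariates.
        m_y = GradientBoostingRegressor(random_state=seed).fit(X[train], y[train])
        # Second use of ML: predict the variable of interest from the covariates.
        m_d = GradientBoostingRegressor(random_state=seed).fit(X[train], d[train])
        # Out-of-fold residuals: each observation is predicted by models
        # that were not trained on it.
        y_res[test] = y[test] - m_y.predict(X[test])
        d_res[test] = d[test] - m_d.predict(X[test])
    # Regress outcome residuals on treatment residuals (no intercept).
    theta = np.sum(d_res * y_res) / np.sum(d_res ** 2)
    # Standard error based on the orthogonal score d_res * (y_res - theta * d_res).
    psi = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(psi ** 2) / len(y)) / np.mean(d_res ** 2)
    return theta, se


# Synthetic data (an assumption): non-linear nuisance functions, true theta = 0.5.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))
d = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(size=n)
y = 0.5 * d + np.cos(X[:, 0]) + X[:, 2] * X[:, 3] + rng.normal(size=n)

theta_hat, se_hat = dml_partially_linear(y, d, X)
print(f"theta_hat = {theta_hat:.3f}, se = {se_hat:.3f}")
```

Both the outcome and the variable of interest are residualized on the covariates with a flexible learner before the final regression. Cross-fitting keeps each residual out-of-sample, which limits the overfitting bias a single in-sample fit would introduce.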