Modelling point mass balance for the glaciers of the Central European Alps using machine learning techniques

Glacier mass balance is typically estimated using a range of in situ measurements, remote sensing measurements, and physical and temperature index modelling techniques. With improved data collection and access to large datasets, data-driven techniques have recently gained prominence in modelling nat...

Full description

Bibliographic Details
Published in:The Cryosphere
Main Authors: R. Anilkumar, R. Bharti, D. Chutia, S. P. Aggarwal
Format: Article in Journal/Newspaper
Language:English
Published: Copernicus Publications 2023
Subjects:
Online Access:https://doi.org/10.5194/tc-17-2811-2023
https://doaj.org/article/fa2073f341d4476d95de015e59998930
Description
Summary:Glacier mass balance is typically estimated using a range of in situ measurements, remote sensing measurements, and physical and temperature index modelling techniques. With improved data collection and access to large datasets, data-driven techniques have recently gained prominence in modelling natural processes. The most common data-driven techniques used today are linear regression models and, to some extent, non-linear machine learning models such as artificial neural networks. However, the entire host of capabilities of machine learning modelling has not been applied to glacier mass balance modelling. This study used monthly meteorological data from ERA5-Land to drive four machine learning models: random forest (ensemble tree type), gradient-boosted regressor (ensemble tree type), support vector machine (kernel type), and artificial neural networks (neural type). We also use ordinary least squares linear regression as a baseline model against which to compare the performance of the machine learning models. Further, we assess the requirement of data for each of the models and the requirement for hyperparameter tuning. Finally, the importance of each meteorological variable in the mass balance estimation for each of the models is estimated using permutation importance. All machine learning models outperform the linear regression model. The neural network model depicted a low bias, suggesting the possibility of enhanced results in the event of biased input data. However, the ensemble tree-based models, random forest and gradient-boosted regressor, outperformed all other models in terms of the evaluation metrics and interpretability of the meteorological variables. The gradient-boosted regression model depicted the best coefficient of determination value of 0.713 and a root mean squared error of 1.071 m w.e. The feature importance values associated with all machine learning models suggested a high importance of meteorological variables associated with ablation. This is in line with predominantly negative mass ...