Code of Dietel et al.: "Combined impacts of temperature, sea ice coverage, and mixing ratios of sea spray and dust on cloud phase over the Arctic and Southern Oceans", submitted to Geophysical Research Letters

Bibliographic Details
Main Authors: Dietel, Barbara, Andersen, Hendrik, Cermak, Jan, Stier, Philip, Hoose, Corinna
Format: Dataset
Language: unknown
Published: Karlsruhe Institute of Technology 2024
Online Access: https://doi.org/10.35097/VEbaqHtbXdEzreqO
Description
Summary:

## Scripts to train a machine learning model (histogram-based gradient boosting regression with scikit-learn) and calculate SHapley Additive exPlanations (SHAP) values

The machine learning model predicts the liquid fraction in different cloud types from four parameters: cloud top temperature, sea ice concentration, dust mixing ratio, and sea salt mixing ratio. More information on the dataset can be found in [Dietel et al. 2023](https://egusphere.copernicus.org/preprints/2023/egusphere-2023-2281/).

### Bash scripts

The bash scripts run the Python scripts for different cloud types and regions on a cluster.

Bash scripts starting with ***GBR_[.]*** (Gradient Boosting Regression) run the Python script ***hist_gbr_subset_final2_with_comments.py*** for different regions (Arctic Ocean (AO), Southern Ocean (SO)) and different cloud types (low-level, mid-level, mid-to-low-level).

Bash scripts starting with ***shap_values_***[.] run the Python script ***shap_values-subset-final2_with_comments.py***, which calculates SHAP values from the trained machine learning models for a subset of 500 000 samples of the validation dataset.

### Python scripts

***hist_gbr_subset_final2_with_comments.py***
Trains a histogram-based gradient boosting regression model using the scikit-learn Python package. More detailed information can be found in the comments in the script.

***shap_values-subset-final2_with_comments.py***
Calculates SHAP values for a subset of 500 000 samples of the validation dataset to make the machine learning model explainable. More detailed information can be found in the comments in the script.