Forecasts, neural networks, and results from the paper: 'Seasonal Arctic sea ice forecasting with probabilistic deep learning'

Bibliographic Details
Main Authors: Andersson, Tom R., Hosking, J. Scott
Format: Dataset
Language: English
Published: NERC EDS UK Polar Data Centre 2021
Subjects:
Online Access: https://dx.doi.org/10.5285/71820e7d-c628-4e32-969f-464b7efb187c
https://data.bas.ac.uk/full-record.php?id=GB/NERC/BAS/PDC/01526
Description
Summary: This dataset contains data produced in the study 'Seasonal Arctic sea ice forecasting with probabilistic deep learning', published in Nature Communications. The study introduces a new Arctic sea ice forecasting AI system, IceNet, which predicts monthly-averaged sea ice probability (SIP; the probability that sea ice concentration (SIC) exceeds 15%) up to 6 months ahead at 25 km resolution. The study demonstrated IceNet's superior seasonal forecasting skill over a state-of-the-art physics-based sea ice forecasting system, ECMWF SEAS5, and a statistical benchmark. This dataset includes three types of data from the study: first, IceNet's SIP forecasts from 2012/1 to 2020/9; second, the 25 neural network files underlying the IceNet model; third, spreadsheets of results from the study. The codebase associated with this work includes a script to download this dataset and reproduce all the paper's figures. This dataset is supported by Wave 1 of The UKRI Strategic Priorities Fund under the EPSRC Grant EP/T001569/1, particularly the "AI for Science" theme within that grant and The Alan Turing Institute. The dataset is also supported by the NERC ACSIS project (grant NE/N018028/1).

The IceNet model comprises an ensemble of 25 individual U-Net deep learning models, whose forecasts are averaged to compute the ensemble mean. IceNet's monthly-averaged inputs comprise SIC, 11 climate variables, statistical SIC forecasts, and metadata. IceNet is trained to forecast the next 6 months of monthly-averaged SIC classification maps at 25 km resolution. At each grid cell and lead time, IceNet's ensemble members produce a discrete probability distribution over three SIC classes: SIC < 15%, 15% < SIC < 80%, and SIC > 80%. The probabilities of the latter two classes are summed to obtain the sea ice probability, P(SIC > 15%). IceNet's training data comprises climate simulations covering 1850-2100 and observational (reanalysis and satellite) data from 1979-2011.
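The ensemble averaging and class summation described above can be sketched for a single grid cell and lead time as follows. This is a minimal illustration, not the IceNet code: the function names, the class ordering within each list, and the member probabilities are all assumptions made for the example.

```python
from typing import List, Sequence

# Assumed class order for this sketch:
# index 0: SIC < 15%, index 1: 15% < SIC < 80%, index 2: SIC > 80%
ClassProbs = Sequence[float]


def ensemble_mean(member_probs: Sequence[ClassProbs]) -> List[float]:
    """Average the class distributions of several ensemble members."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [
        sum(member[k] for member in member_probs) / n_members
        for k in range(n_classes)
    ]


def sea_ice_probability(class_probs: ClassProbs) -> float:
    """SIP = P(SIC > 15%): sum of the two classes above the 15% threshold."""
    return class_probs[1] + class_probs[2]


# Two hypothetical ensemble members (IceNet itself uses 25).
members = [
    [0.50, 0.25, 0.25],
    [0.00, 0.25, 0.75],
]
mean_probs = ensemble_mean(members)
sip = sea_ice_probability(mean_probs)
```

Averaging before summing and summing before averaging give the same SIP, because both operations are linear.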
Observational data from 2012-2017 was used to validate the model during production, and 2018-2020 was used as the final test set. After training the IceNet model, we calibrated its probabilities on 2012-2017 data with an approach called temperature scaling. We then used the held-out data from 2012-2020 to compare IceNet's forecasting skill with a dynamical model (ECMWF SEAS5) and a statistical benchmark (a linear trend extrapolation model). Performance was measured with a binary accuracy metric, which computes the percentage of grid cells with the correct binary prediction for SIC > 15%. We then devised a framework for bounding the ice edge based on predicted SIP values and analysed the ability of IceNet and SEAS5 to bound the ice edge. Finally, we used a variable importance method (permute-and-predict) to identify the climate variables most important for IceNet's forecasts. Full details on the methodology behind the generation of this dataset can be found in the associated paper, particularly the Methods section, as well as the GitHub codebase. We thank the contributors to the Sea Ice Outlooks from 2012 to 2020, whose sea ice extent predictions are used for the sea_ice_outlook_errors.csv file.

All data was generated using Python v3.7. The IceNet model was developed using the Python package TensorFlow v2.2.

IceNet makes predictions based on ERA5 reanalysis data and OSI-SAF SIC data; for information on their errors, see their associated documentation. IceNet's SIP values were set to zero over a land mask and outside of a monthly maximum SIC climatology mask obtained from OSI-SAF.
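As a rough illustration of two steps named in the methodology, the sketch below shows temperature scaling in its standard form (dividing logits by a scalar T before the softmax) and a binary accuracy computed over a set of grid cells. It is not taken from the IceNet codebase. The function names, the 0.5 SIP threshold used to turn a probability into an ice/no-ice prediction, the `active` mask, and all sample values are assumptions for the example; how T was fitted in the study is described in the paper and is not reproduced here.

```python
import math
from typing import List, Sequence


def temperature_scale(logits: Sequence[float], temperature: float) -> List[float]:
    """Softmax of logits / T. T > 1 softens the distribution, T < 1 sharpens it."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def binary_accuracy(
    sip: Sequence[float],
    observed_sic: Sequence[float],
    active: Sequence[bool],
    threshold: float = 0.5,  # assumed cut-off for predicting "ice"
) -> float:
    """Percentage of active cells where (SIP > threshold) matches (SIC > 15%)."""
    correct = 0
    counted = 0
    for p, sic, use in zip(sip, observed_sic, active):
        if not use:  # skip cells excluded by the hypothetical mask
            continue
        counted += 1
        if (p > threshold) == (sic > 0.15):
            correct += 1
    if counted == 0:
        raise ValueError("no active grid cells to score")
    return 100.0 * correct / counted


# Hypothetical logits for one grid cell, before and after scaling with T = 2.
uncalibrated = temperature_scale([2.0, 1.0, 0.0], 1.0)
calibrated = temperature_scale([2.0, 1.0, 0.0], 2.0)

# Hypothetical flattened grid of five cells; the last one is masked out.
forecast_sip = [0.9, 0.6, 0.2, 0.4, 0.0]
observed = [0.95, 0.10, 0.05, 0.30, 0.0]  # SIC as a fraction in [0, 1]
active_cells = [True, True, True, True, False]
accuracy = binary_accuracy(forecast_sip, observed, active_cells)
```

Temperature scaling leaves the most likely class unchanged, so it alters how confident a forecast is without changing which class is predicted.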