Statistical modeling of Southern Ocean marine diatom proxy and winter sea ice data: Model comparison and developments

Bibliographic Details
Published in: Progress in Oceanography
Main Authors: Ferry, Alexander, Prvan, Tania, Jersky, Brian, Crosta, Xavier, Armand, Leanne
Other Authors: Department of Biological Sciences North Ryde, Macquarie University, Environnements et Paléoenvironnements OCéaniques (EPOC), Observatoire aquitain des sciences de l'univers (OASU), Université Sciences et Technologies - Bordeaux 1-Institut national des sciences de l'Univers (INSU - CNRS)-Centre National de la Recherche Scientifique (CNRS)-Université Sciences et Technologies - Bordeaux 1-Institut national des sciences de l'Univers (INSU - CNRS)-Centre National de la Recherche Scientifique (CNRS)-École pratique des hautes études (EPHE), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)
Format: Article in Journal/Newspaper
Language: English
Published: HAL CCSD 2015
Subjects:
GAM
Online Access: https://hal.archives-ouvertes.fr/hal-02105563
https://doi.org/10.1016/j.pocean.2014.12.001
Description
Summary: We compare the performance of the modern analog technique (MAT), the Imbrie and Kipp transfer function (IKTF), the generalized additive model (GAM) and weighted averaging partial least squares (WA PLS) on a Southern Hemisphere diatom relative abundance and winter sea ice concentration training data set. All relevant model assumptions are tested with a random 10-fold cross-validation, whilst a hold-out cross-validation tested the explanatory power of each model on spatially independent validation data. We used autocorrelograms on model residuals, variance partitioning, and principal coordinates analysis of neighbor matrices (PCNM) to investigate the importance of the spatial structure of our training database. A set of hierarchical logistic regression models (or Huisman–Olff–Fresco models) is used to infer the response of each diatom species along the winter sea ice gradient. Our analyses suggest that IKTF is an inappropriate sea ice estimation approach, as its underlying statistical assumptions do not hold and the fit of IKTF to our data under cross-validation was poor. We conclude that MAT may be biased by spatial autocorrelation and, together with IKTF, fails to provide unbiased estimates of winter sea ice. We find GAM and WA PLS are more appropriate than IKTF and MAT for the estimation of paleo winter sea ice cover throughout the Southern Ocean. However, as WA PLS is based on a unimodal species response, which is rarely exhibited by diatoms along the winter sea ice gradient, we ultimately advocate the application of GAM. GAM only uses diatoms with a statistically significant association, and ecologically based link, with sea ice. GAM outperformed all other models under cross-validation in terms of performance statistics, the fit of GAM to the training dataset and diagnostic tests for model assumptions. We also demonstrate that GAM provides a more detailed and potentially more accurate (based on a comparison with New Zealand and southeast Australian paleoclimatic records) paleo winter ...