A Statistical Modeling Framework for Characterising Uncertainty in Large Datasets: Application to Ocean Colour

Bibliographic Details
Published in: Remote Sensing
Main Authors: Land, PE; Bailey, TC; Taberner, M; Pardo, S; Sathyendranath, S; Nejabati Zenouz, K; Brammall, V; Shutler, JD; Quartly, GD
Format: Article in Journal/Newspaper
Language: English
Published: MDPI 2018
Online Access: http://plymsea.ac.uk/id/eprint/7885/
http://plymsea.ac.uk/id/eprint/7885/1/remotesensing-10-00695.pdf
http://www.mdpi.com/2072-4292/10/5/695
https://doi.org/10.3390/rs10050695
Description
Summary: Uncertainty estimation is crucial to establishing confidence in any data analysis, and this is especially true for Essential Climate Variables, including ocean colour. Methods for deriving uncertainty vary greatly across data types, so a generic statistics-based approach applicable to multiple data types is advantageous, simplifying the use and understanding of uncertainty data. Progress towards rigorous uncertainty analysis of ocean colour has been slow, in part because of the complexity of ocean colour processing. Here, we present a general approach to uncertainty characterisation, using a database of satellite-in situ matchups to generate a statistical model of satellite uncertainty as a function of its contributing variables. With an example NASA MODIS-Aqua chlorophyll-a matchups database mostly covering the North Atlantic, we demonstrate a model that explains 67% of the squared error in log(chlorophyll-a) as a potentially correctable bias, with the remaining uncertainty being characterised as standard deviation and standard error at each pixel. The method is quite general, depending only on the existence of a suitable database of matchups or reference values, and can be applied to other sensors and data types, such as other satellite-observed Essential Climate Variables, empirical algorithms derived from in situ data, or even model data.
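
Illustrative sketch (not from the article): the summary describes splitting the satellite error in log(chlorophyll-a) into a modelled, potentially correctable bias and a remaining random uncertainty. This record does not give the form of the authors' model, so the Python below assumes a plain ordinary least-squares fit on a single hypothetical predictor, using synthetic data in place of a real matchup database. The function name fit_bias_model, the predictor x, and all numeric values are assumptions made only for illustration.

```python
import numpy as np


def fit_bias_model(design, err):
    """Fit err ~ design by ordinary least squares (an assumed model form).

    Returns the fitted bias, the fraction of squared error attributed to
    that bias, the residual standard deviation, and the standard error of
    the fitted bias at each matchup.
    """
    n, p = design.shape
    coef, *_ = np.linalg.lstsq(design, err, rcond=None)
    bias_hat = design @ coef
    resid = err - bias_hat

    # Share of the total squared error captured by the modelled bias.
    frac_bias = 1.0 - np.sum(resid**2) / np.sum(err**2)

    # Residual standard deviation with n - p degrees of freedom.
    sigma = np.sqrt(np.sum(resid**2) / (n - p))

    # Standard error of the fitted bias: sqrt(diag(A cov A^T)).
    cov = sigma**2 * np.linalg.inv(design.T @ design)
    se_bias = np.sqrt(np.einsum("ij,jk,ik->i", design, cov, design))
    return bias_hat, frac_bias, sigma, se_bias


# Synthetic stand-in for a matchup database (hypothetical values only).
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(-1.0, 1.0, size=n)          # hypothetical contributing variable
log_insitu = rng.normal(0.0, 0.5, size=n)   # log of in situ chlorophyll-a
log_sat = log_insitu + (0.1 + 0.3 * x) + rng.normal(0.0, 0.1, size=n)

err = log_sat - log_insitu                  # satellite minus in situ, in log space
A = np.column_stack([np.ones(n), x])        # intercept plus predictor
bias_hat, frac_bias, sigma, se_bias = fit_bias_model(A, err)

print(f"fraction of squared error modelled as bias: {frac_bias:.2f}")
print(f"residual standard deviation: {sigma:.3f}")
```

In this sketch the bias-corrected value at a matchup would be log_sat minus bias_hat, with sigma describing the spread that remains and se_bias describing how well the bias itself is known at that point. A real application would use several contributing variables and possibly a non-linear model.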