Empirical methods for the estimation of Southern Ocean CO2: support vector and random forest regression

The Southern Ocean accounts for 40 % of oceanic CO2 uptake, but the estimates are bound by large uncertainties due to a paucity of observations. Gap-filling empirical methods have been used to good effect to approximate pCO2 from satellite-observable variables in other parts of the ocean, but man...


Bibliographic Details
Published in: Biogeosciences
Main Authors: L. Gregor, S. Kok, P. M. S. Monteiro
Format: Article in Journal/Newspaper
Language: English
Published: Copernicus Publications 2017
Online Access: https://doi.org/10.5194/bg-14-5551-2017
https://doaj.org/article/81f909781272422d9f780891f66b2513
id ftdoajarticles:oai:doaj.org/article:81f909781272422d9f780891f66b2513
record_format openpolar
institution Open Polar
collection Directory of Open Access Journals: DOAJ Articles
op_collection_id ftdoajarticles
language English
topic Ecology
QH540-549.5
Life
QH501-531
Geology
QE1-996.5
description The Southern Ocean accounts for 40 % of oceanic CO2 uptake, but the estimates are bound by large uncertainties due to a paucity of observations. Gap-filling empirical methods have been used to good effect to approximate pCO2 from satellite-observable variables in other parts of the ocean, but many of these methods are not in agreement in the Southern Ocean. In this study we propose two additional methods that perform well in the Southern Ocean: support vector regression (SVR) and random forest regression (RFR). The methods are used to estimate ΔpCO2 in the Southern Ocean based on SOCAT v3, achieving similar trends to the SOM-FFN method by Landschützer et al. (2014). Results show that the SOM-FFN and RFR approaches have RMSEs of similar magnitude (14.84 and 16.45 µatm, where 1 atm = 101 325 Pa), whereas the SVR method has a larger RMSE (24.40 µatm). However, the larger errors for SVR and RFR are, in part, due to an increase in coastal observations from SOCAT v2 to v3, where the SOM-FFN method used v2 data. The success of both SOM-FFN and RFR depends on the ability to adapt to different modes of variability. The SOM-FFN achieves this by having independent regression models for each cluster, while this flexibility is intrinsic to the RFR method. Analyses of the estimates show that the SVR and RFR's respective sensitivity and robustness to outliers define the outcome significantly. Further analyses on the methods were performed by using a synthetic dataset to assess the following: which method (RFR or SVR) has the best performance? What is the effect of using time, latitude and longitude as proxy variables on ΔpCO2? What is the impact of the sampling bias in the SOCAT v3 dataset on the estimates? We find that while RFR is indeed better than SVR, the ensemble of the two methods outperforms either one, due to complementary strengths and weaknesses of the methods. Results also show that for the RFR and SVR implementations, it is better to include coordinates as proxy variables as RMSE scores are lowered and the ...
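The RMSE comparison and the ensemble claim in the description can be illustrated with a minimal sketch. It assumes the ensemble is an unweighted mean of the two members, which this record does not confirm. The function names and all numeric values below are hypothetical, chosen so that the two members have partly opposing errors. They are not data from the article.

```python
import math


def rmse(estimates, observations):
    """Root-mean-square error between paired estimates and observations."""
    if len(estimates) != len(observations):
        raise ValueError("inputs must have the same length")
    squared = [(e - o) ** 2 for e, o in zip(estimates, observations)]
    return math.sqrt(sum(squared) / len(squared))


def ensemble_mean(*members):
    """Element-wise unweighted mean of equally long lists (an assumed scheme)."""
    return [sum(values) / len(values) for values in zip(*members)]


# 1 atm = 101 325 Pa (as stated in the description), so 1 µatm = 0.101325 Pa.
MICROATM_TO_PA = 101_325 * 1e-6

# Hypothetical values in µatm, invented for illustration only.
observed = [-20.0, -5.0, 10.0, 0.0]
svr_like = [-26.0, -1.0, 16.0, 4.0]   # errors: -6, +4, +6, +4
rfr_like = [-16.0, -8.0, 6.0, -3.0]   # errors: +4, -3, -4, -3
combined = ensemble_mean(svr_like, rfr_like)

for name, est in [("SVR-like", svr_like), ("RFR-like", rfr_like), ("ensemble", combined)]:
    print(f"{name}: RMSE = {rmse(est, observed):.2f} µatm")

# Unit conversion of an RMSE quoted in the description.
print(f"14.84 µatm = {14.84 * MICROATM_TO_PA:.4f} Pa")
```

Averaging helps here only because the invented errors partly cancel. With members whose errors are strongly correlated, an unweighted mean would not necessarily beat the better member.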
format Article in Journal/Newspaper
author L. Gregor
S. Kok
P. M. S. Monteiro
title Empirical methods for the estimation of Southern Ocean CO2: support vector and random forest regression
publisher Copernicus Publications
publishDate 2017
url https://doi.org/10.5194/bg-14-5551-2017
https://doaj.org/article/81f909781272422d9f780891f66b2513
geographic Southern Ocean
genre Southern Ocean
op_source Biogeosciences, Vol 14, Pp 5551-5569 (2017)
op_relation https://www.biogeosciences.net/14/5551/2017/bg-14-5551-2017.pdf
https://doaj.org/toc/1726-4170
https://doaj.org/toc/1726-4189
doi:10.5194/bg-14-5551-2017
1726-4170
1726-4189
https://doaj.org/article/81f909781272422d9f780891f66b2513
op_doi https://doi.org/10.5194/bg-14-5551-2017
container_title Biogeosciences
container_volume 14
container_issue 23
container_start_page 5551
op_container_end_page 5569
_version_ 1766205604459708416