Evaluation of Machine Learning Techniques for Estimating Biogeochemical Properties on Seaglider Tracks in the Southern Ocean

Bibliographic Details
Main Author: Nachod, Zachary
Language: unknown
Published: 2022
Subjects:
Online Access: http://hdl.handle.net/1773/48384
Description
Summary: The Southern Ocean is the largest oceanic carbon and heat sink on the planet, with complex dynamics at a variety of scales. Reliable, accurate, and high-resolution estimates of nitrate and carbonate system parameters (hereafter biogeochemical estimates) in the Southern Ocean would enable the analysis of mesoscale and submesoscale biogeochemical processes throughout the water column. This work explores the use of multiple methods, including several from the machine learning literature, for biogeochemical parameter estimation in the Southern Ocean. Training data for this work include temperature, salinity, oxygen, and nitrate measurements from the 2019 R/V Thomas G. Thompson reoccupation of the I06S line and from Southern Ocean Carbon and Climate Observations and Modeling project (SOCCOM) floats deployed during this cruise. Four models for the estimation of nitrate were trained and validated for accuracy: a random forest regression, a generalized additive model, a multiple linear regression, and a gradient-boosted regression tree model. The random forest regression performed best of the four machine learning and statistical models on our nitrate test data, with a median absolute error of 0.09 μmol kg⁻¹ and an interquartile range of the absolute error of 0.13 μmol kg⁻¹. Using this random forest model, we predicted the nitrate concentrations along the high-resolution tracks of two Seagliders deployed on this cruise. We plan to later repeat this estimation process for pH along the Seaglider tracks as well. The nitrate and pH estimates from the random forest model can be used to improve our understanding of mesoscale and submesoscale processes related to carbon flux in this region of the Southern Ocean.
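
The summary reports model skill as the median and interquartile range of the absolute error. The following is a minimal sketch of how those two statistics might be computed, using only the Python standard library. The function name, the choice of the "inclusive" quartile method, and the sample values are assumptions made for illustration; they are not taken from the thesis, and the sample numbers are not thesis data.

```python
import statistics


def absolute_error_summary(observed, predicted):
    """Return (median, interquartile range) of the absolute errors.

    Hypothetical helper for illustration; the thesis does not specify
    how its quartiles were computed.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    abs_err = [abs(o - p) for o, p in zip(observed, predicted)]
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(abs_err, n=4, method="inclusive")
    return statistics.median(abs_err), q3 - q1


# Made-up nitrate values in umol/kg (illustrative only, not thesis data).
observed = [30.0, 31.0, 32.0, 33.0, 34.0]
predicted = [30.1, 30.8, 32.3, 32.6, 34.5]

median_ae, iqr_ae = absolute_error_summary(observed, predicted)
print(f"median absolute error: {median_ae:.2f} umol/kg")
print(f"IQR of absolute error: {iqr_ae:.2f} umol/kg")
```

Both statistics are robust to a few large misfits, which may be why they were preferred over a mean-based measure for comparing the four models; that motivation is a guess, not something the summary states.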