Ocean Heat Content Structure Revealed by Un-Supervised Classification of Hydrographic Profiles


Bibliographic Details
Main Authors: Maze, Guillaume, Mercier, Herlé, Fablet, Ronan, Lenca, Philippe, Lopez Radcenco, Manuel, Tandeo, Pierre, Le Goff, Clement, Feucher, Charlène
Other Authors: Laboratoire de physique des océans (LPO), Institut de Recherche pour le Développement (IRD)-Institut Français de Recherche pour l'Exploitation de la Mer (IFREMER)-Université de Brest (UBO)-Centre National de la Recherche Scientifique (CNRS), Lab-STICC_TB_CID_TOMS, Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance (Lab-STICC), Université européenne de Bretagne - European University of Brittany (UEB)-École Nationale d'Ingénieurs de Brest (ENIB)-Université de Bretagne Sud (UBS)-Université de Brest (UBO)-Télécom Bretagne-Institut Brestois du Numérique et des Mathématiques (IBNM), Université de Brest (UBO)-École Nationale Supérieure de Techniques Avancées Bretagne (ENSTA Bretagne)-Institut Mines-Télécom Paris (IMT)-Centre National de la Recherche Scientifique (CNRS)-Université européenne de Bretagne - European University of Brittany (UEB)-École Nationale d'Ingénieurs de Brest (ENIB)-Université de Bretagne Sud (UBS)-Université de Brest (UBO)-Télécom Bretagne-Institut Brestois du Numérique et des Mathématiques (IBNM), Université de Brest (UBO)-École Nationale Supérieure de Techniques Avancées Bretagne (ENSTA Bretagne)-Institut Mines-Télécom Paris (IMT)-Centre National de la Recherche Scientifique (CNRS), Département Signal et Communications (SC), Université européenne de Bretagne - European University of Brittany (UEB)-Télécom Bretagne-Institut Mines-Télécom Paris (IMT), Lab-STICC_TB_CID_DECIDE, Département Logique des Usages, Sciences sociales et Sciences de l'Information (LUSSI)
Format: Conference Object
Language: French
Published: HAL CCSD 2016
Online Access: https://hal.science/hal-01345101
Description
Summary: In the data mining community, unsupervised classification (or clustering) techniques are used to reveal and explore the hidden structure of a dataset. Among them, mixture models, and in particular Gaussian Mixture Models (GMM), are a very popular tool. With a GMM, the statistical distribution of the dataset is decomposed into a weighted sum of Gaussian modes that maximizes the likelihood of the dataset. Each mode is characterized by a multivariate Gaussian distribution in the D-dimensional space of the dataset. This technique is routinely used in atmospheric science to describe weather regimes but has yet to be explored in physical oceanography. Here, we are interested in determining the structure of the North Atlantic ocean temperature field. Argo profiles in the North Atlantic were compressed along their vertical axis in order to reduce the dimensionality of the problem. We then fitted GMMs to this reduced profile dataset according to a Maximum Likelihood criterion using the Expectation-Maximization algorithm. A seven-mode GMM was the most relevant to describe the dataset. Each mode, or cluster, is characterized by the parameters of a Gaussian distribution (a mean and a covariance matrix). The GMM also provides, for each profile of the dataset, the probability that it belongs to a specific cluster. We will show that this information can be used to describe physically coherent heat reservoirs and their variability. Indeed, we found that the clusters capture the large-scale climatological structure of the temperature field. Each cluster corresponds to a physically coherent region (equatorial, tropical, subtropical, intergyre, or subpolar) and is associated with a reference profile. A hierarchical clustering was applied to characterize the regional variability of the dataset. We will present these clusters and their possible uses, both for scientific analysis of heat content variability in the North Atlantic and for technical validation of the Argo array.
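The workflow described in the summary (compress each profile along the vertical, fit a GMM by Expectation-Maximization, then read off per-profile cluster probabilities) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the profiles are synthetic rather than Argo data, PCA is assumed as the compression step (the summary does not name the method), scikit-learn is assumed as the library, and three modes are used to match the synthetic data rather than the seven modes reported above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for temperature profiles (NOT Argo data):
# three hypothetical regimes, 200 profiles each, on 50 vertical levels.
depth = np.linspace(0.0, 1.0, 50)          # normalized depth, illustrative
surface_temps = [25.0, 15.0, 5.0]          # hypothetical surface temperatures
profiles = np.vstack([
    t0 * np.exp(-3.0 * depth) + rng.normal(0.0, 0.3, size=(200, depth.size))
    for t0 in surface_temps
])

# Step 1: compress each profile along its vertical axis.
# PCA is an assumption here; the summary only says "compressed".
pca = PCA(n_components=2)
reduced = pca.fit_transform(profiles)

# Step 2: fit a GMM to the reduced dataset. scikit-learn's GaussianMixture
# maximizes the likelihood with the Expectation-Maximization algorithm.
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(reduced)

# Each mode is described by a mean vector and a covariance matrix.
means = gmm.means_                # shape (3, 2)
covariances = gmm.covariances_    # shape (3, 2, 2)

# Step 3: probability that each profile belongs to each cluster.
posteriors = gmm.predict_proba(reduced)    # shape (600, 3)
labels = posteriors.argmax(axis=1)         # most probable cluster per profile
```

Keeping the full posterior matrix, rather than only the hard labels, preserves the membership probabilities that the summary highlights as useful for describing each cluster and its variability.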