Nonlinear principal component analysis of climate data

A nonlinear generalisation of Principal Component Analysis (PCA), denoted Nonlinear Principal Component Analysis (NLPCA), is introduced and applied to the analysis of climate data. This method is implemented using a 5-layer feed-forward neural network introduced originally in the chemical engineerin...

Bibliographic Details
Main Author: Monahan, Adam Hugh
Format: Thesis
Language: English
Published: 2000
Subjects:
Online Access: http://hdl.handle.net/2429/10829
id ftunivbritcolcir:oai:circle.library.ubc.ca:2429/10829
record_format openpolar
institution Open Polar
collection University of British Columbia: cIRcle - UBC's Information Repository
op_collection_id ftunivbritcolcir
language English
description A nonlinear generalisation of Principal Component Analysis (PCA), denoted Nonlinear Principal Component Analysis (NLPCA), is introduced and applied to the analysis of climate data. This method is implemented using a 5-layer feed-forward neural network introduced originally in the chemical engineering literature. The method is described and details of its implementation are addressed. It is found empirically that NLPCA partitions variance in the same fashion as does PCA, that is, that the sum of the total variance of the NLPCA approximation with the total variance of the residual from the original data is equal to the total variance of the original data. An important distinction is drawn between a modal P-dimensional NLPCA analysis, in which P successive 1D approximations are determined iteratively so that the approximation is the sum of P nonlinear functions of one variable, and a nonmodal analysis, in which the P-dimensional NLPCA approximation is determined as a nonlinear non-additive function of P variables. Nonlinear Principal Component Analysis is first applied to a data set sampled from the Lorenz attractor. It is found that the NLPCA approximations are much more representative of the data than are the corresponding PCA approximations. In particular, the 1D and 2D NLPCA approximations explain 76% and 99.5% of the total variance, respectively, in contrast to 60% and 95% explained by the 1D and 2D PCA approximations. When applied to a data set consisting of monthly-averaged tropical Pacific Ocean sea surface temperatures (SST), the modal 1D NLPCA approximation describes average variability associated with the El Nino/Southern Oscillation (ENSO) phenomenon, as does the 1D PCA approximation. The NLPCA approximation, however, characterises the asymmetry in spatial pattern of SST anomalies between average warm and cold events (manifested in the skewness of the distribution) in a manner that the PCA approximation cannot. 
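The variance-partition property and the Lorenz test case described above can be illustrated for the linear PCA baseline. The following is a minimal sketch, not the thesis's procedure: the Lorenz parameters (sigma = 10, rho = 28, beta = 8/3), the RK4 integrator, the step size, the sample length, and all function names are assumptions chosen for illustration. The explained-variance fractions it prints depend on those choices and need not match the figures quoted in the abstract.

```python
import numpy as np

def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) system (standard parameters assumed)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def sample_lorenz(n_steps=5000, dt=0.01, spinup=1000):
    """Integrate with classical RK4 and return the post-spinup states, shape (n_steps, 3)."""
    state = np.array([1.0, 1.0, 1.0])
    samples = []
    for step in range(n_steps + spinup):
        k1 = lorenz_rhs(state)
        k2 = lorenz_rhs(state + 0.5 * dt * k1)
        k3 = lorenz_rhs(state + 0.5 * dt * k2)
        k4 = lorenz_rhs(state + dt * k3)
        state = state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if step >= spinup:
            samples.append(state)
    return np.array(samples)

def pca_approximation(data, n_modes):
    """Project the anomalies onto the leading n_modes principal directions."""
    mean = data.mean(axis=0)
    anomalies = data - mean
    _, _, vt = np.linalg.svd(anomalies, full_matrices=False)
    basis = vt[:n_modes]                      # rows are principal directions
    return mean + anomalies @ basis.T @ basis

def total_variance(field):
    """Sum of the per-variable variances."""
    return field.var(axis=0).sum()

data = sample_lorenz()
results = {}
for p in (1, 2):
    approx = pca_approximation(data, p)
    residual = data - approx
    explained = total_variance(approx) / total_variance(data)
    # For PCA, var(data) = var(approx) + var(residual) up to round-off.
    gap = total_variance(data) - (total_variance(approx) + total_variance(residual))
    results[p] = (explained, gap)
    print(f"{p}D PCA: explained fraction = {explained:.3f}, partition gap = {gap:.2e}")
```

For PCA the partition holds exactly because the approximation and the residual are orthogonal; the abstract reports that the same partition was found empirically for NLPCA.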
The second NLPCA mode of SST is found to characterise differences in ENSO variability between individual events, and in particular is consistent with the celebrated 1977 "regime shift". A 2D nonmodal NLPCA approximation is determined, the interpretation of which is complicated by the fact that a secondary feature extraction problem has to be carried out to interpret the results. It is found that this approximation contains much the same information as that provided by the modal analysis. A modal NLPC analysis of tropical Indo-Pacific sea level pressure (SLP) finds that the first mode describes average ENSO variability in this field, and also characterises an asymmetry in SLP fields between average warm and cold events. No robust nonlinear mode beyond the first could be found. Nonlinear Principal Component Analysis is used to find the optimal nonlinear approximation to SLP data produced by a 1001-year integration of the Canadian Centre for Climate Modelling and Analysis (CCCma) coupled general circulation model (CGCM1). This approximation's associated time series is strongly bimodal and partitions the data into two distinct regimes. The first and more persistent regime describes a standing oscillation whose signature in the mid-troposphere is alternating amplification and attenuation of the climatological ridge over Northern Europe. The second and more episodic regime describes mid-tropospheric split-flow south of Greenland. Essentially the same structure is found in the 1D NLPCA approximation of the 500 mb height field itself. In a 500-year integration with atmospheric CO2 at four times pre-industrial concentrations, the occupation statistics of these preferred modes of variability change, such that the episodic split-flow regime occurs less frequently while the standing oscillation regime occurs more frequently.
Finally, a generalisation of Kramer’s NLPCA using a 7-layer autoassociative neural network is introduced to address the inability of Kramer’s original network to find P-dimensional structure topologically different from the unit cube in R^P. The example of an ellipse is considered, and it is shown that the approximation produced by the 7-layer network is a substantial improvement over that produced by the 5-layer network. [Scientific formulae used in this abstract could not be reproduced.] Science, Faculty of; Earth, Ocean and Atmospheric Sciences, Department of; Graduate
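The 5-layer and 7-layer autoassociative networks mentioned in the description can be pictured with a short forward-pass sketch. This is an illustration under stated assumptions, not the thesis's implementation: the hidden-layer widths, the tanh transfer functions on the hidden layers, the linear bottleneck and output layers, the random initialisation, and all names are hypothetical. The networks here are untrained; fitting would minimise the squared difference between input and reconstruction.

```python
import numpy as np

def init_network(layer_sizes, rng):
    """Random weights and zero biases for consecutive layer pairs (illustrative only)."""
    return [
        (rng.normal(scale=0.5, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def forward(params, x, bottleneck_index):
    """Return (reconstruction, bottleneck activations).

    bottleneck_index is the 0-based position of the bottleneck in layer_sizes.
    Assumed here: tanh on hidden layers, linear bottleneck and output layers.
    """
    activation = x
    code = None
    output_index = len(params)
    for layer, (weights, bias) in enumerate(params, start=1):
        pre_activation = activation @ weights + bias
        if layer == bottleneck_index or layer == output_index:
            activation = pre_activation           # linear layer
        else:
            activation = np.tanh(pre_activation)  # nonlinear hidden layer
        if layer == bottleneck_index:
            code = activation
    return activation, code

rng = np.random.default_rng(0)

# Points on an ellipse in the plane, the test case named in the abstract.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
ellipse = np.column_stack([2.0 * np.cos(theta), np.sin(theta)])

# Layers: input, encoding, bottleneck (P = 1), decoding, output.
sizes5 = [2, 4, 1, 4, 2]
# Hypothetical 7-layer layout: one extra hidden layer on each side of the bottleneck.
sizes7 = [2, 4, 4, 1, 4, 4, 2]

recon5, code5 = forward(init_network(sizes5, rng), ellipse, bottleneck_index=len(sizes5) // 2)
recon7, code7 = forward(init_network(sizes7, rng), ellipse, bottleneck_index=len(sizes7) // 2)
print(recon5.shape, code5.shape, recon7.shape, code7.shape)
```

In both layouts the single bottleneck neuron plays the role of the 1D nonlinear principal component, and the layers after it map that value back to the data space.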
format Thesis
author Monahan, Adam Hugh
spellingShingle Monahan, Adam Hugh
Nonlinear principal component analysis of climate data
author_facet Monahan, Adam Hugh
author_sort Monahan, Adam Hugh
title Nonlinear principal component analysis of climate data
title_short Nonlinear principal component analysis of climate data
title_full Nonlinear principal component analysis of climate data
title_fullStr Nonlinear principal component analysis of climate data
title_full_unstemmed Nonlinear principal component analysis of climate data
title_sort nonlinear principal component analysis of climate data
publishDate 2000
url http://hdl.handle.net/2429/10829
geographic Greenland
Pacific
geographic_facet Greenland
Pacific
genre Greenland
genre_facet Greenland
op_rights For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
_version_ 1766020525508788224
spelling ftunivbritcolcir:oai:circle.library.ubc.ca:2429/10829 2023-05-15T16:30:47+02:00 Nonlinear principal component analysis of climate data Monahan, Adam Hugh 2000 8824754 bytes application/pdf http://hdl.handle.net/2429/10829 eng eng For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. Text Thesis/Dissertation 2000 ftunivbritcolcir 2019-10-15T17:49:03Z Science, Faculty of; Earth, Ocean and Atmospheric Sciences, Department of; Graduate Thesis Greenland University of British Columbia: cIRcle - UBC's Information Repository Greenland Pacific