Using PPCA to estimate EOFs in the presence of missing values

One of the problems encountered when using satellite-derived sea surface temperature (SST) data is the impossibility of retrieving data where the ocean surface is obscured by cloud. Empirical orthogonal function (EOF) analysis cannot be carried out easily when there are missing values within the dat...

Bibliographic Details
Main Authors: Houseago-Stokes, R.E., Challenor, P.G.
Format: Article in Journal/Newspaper
Language: unknown
Published: 2004
Online Access: https://eprints.soton.ac.uk/9645/
http://ams.allenpress.com/amsonline/?request=get-abstract&issn=1520-0426&volume=021&issue=09&page=1471
id ftsouthampton:oai:eprints.soton.ac.uk:9645
record_format openpolar
institution Open Polar
collection University of Southampton: e-Prints Soton
op_collection_id ftsouthampton
language unknown
description One of the problems encountered when using satellite-derived sea surface temperature (SST) data is the impossibility of retrieving data where the ocean surface is obscured by cloud. Empirical orthogonal function (EOF) analysis cannot be carried out easily when there are missing values within the dataset. One possible solution is to interpolate using the existing data. In this paper an alternative technique is investigated, probabilistic principal component analysis (PPCA), and applied to calculate the principal EOFs of North Atlantic SSTs. This analysis uses results obtained from interpolating the SST data using a simplified Kalman filter, with data randomly removed to simulate missing values, and then reconstructs the data using PPCA, obtaining the principal EOFs. The calculation of the EOFs was quicker than traditional EOF analysis, as the covariance matrix was estimated rather than calculated. The replacement of missing values was also computationally more efficient than using the Kalman filter, taking a fraction of the time. The expectation–maximization (EM) algorithm produced similar results to those produced through standard procedures. However, the choice of the number of EOFs to be retained had a significant effect on the accuracy of the interpolated dataset, with more EOFs reducing the accuracy of the reconstructed dataset.
format Article in Journal/Newspaper
author Houseago-Stokes, R.E.
Challenor, P.G.
spellingShingle Houseago-Stokes, R.E.
Challenor, P.G.
Using PPCA to estimate EOFs in the presence of missing values
author_facet Houseago-Stokes, R.E.
Challenor, P.G.
author_sort Houseago-Stokes, R.E.
title Using PPCA to estimate EOFs in the presence of missing values
title_short Using PPCA to estimate EOFs in the presence of missing values
title_full Using PPCA to estimate EOFs in the presence of missing values
title_fullStr Using PPCA to estimate EOFs in the presence of missing values
title_full_unstemmed Using PPCA to estimate EOFs in the presence of missing values
title_sort using ppca to estimate eofs in the presence of missing values
publishDate 2004
url https://eprints.soton.ac.uk/9645/
http://ams.allenpress.com/amsonline/?request=get-abstract&issn=1520-0426&volume=021&issue=09&page=1471
genre North Atlantic
genre_facet North Atlantic
op_relation Houseago-Stokes, R.E. and Challenor, P.G. (2004) Using PPCA to estimate EOFs in the presence of missing values. Journal of Atmospheric and Oceanic Technology, 21 (9), 1471-1480. (doi:10.1175/1520-0426(2004)021<1471:UPTEEI>2.0.CO;2 <http://dx.doi.org/10.1175/1520-0426(2004)021<1471:UPTEEI>2.0.CO;2>).
op_doi https://doi.org/10.1175/1520-0426(2004)021<1471:UPTEEI>2.0.CO;2
_version_ 1772817368481792000
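
The description in this record refers to estimating EOFs with the EM algorithm for PPCA while treating cloud-obscured values as missing. The following is a minimal Python sketch of that general idea, not the authors' implementation. The function name `ppca_em_missing`, its parameters, the use of NaN to mark missing values, and the synthetic test field are all assumptions made for illustration. The sketch also assumes that every grid point (column) has at least one observed value.

```python
import numpy as np


def ppca_em_missing(Y, q, n_iter=100, seed=0):
    """EM for probabilistic PCA on Y (time x space) with NaN as missing.

    Returns (eofs, Y_filled, sigma2): orthonormal EOFs (space x q), the data
    with missing entries replaced by the model reconstruction, and the
    estimated noise variance. Hypothetical helper, for illustration only.
    """
    rng = np.random.default_rng(seed)
    n, d = Y.shape
    obs = ~np.isnan(Y)                   # True where a value was observed
    mu = np.nanmean(Y, axis=0)           # initial mean from observed values
    W = rng.standard_normal((d, q))      # random initial loadings
    sigma2 = 1.0                         # initial noise variance
    X = np.zeros((n, q))                 # posterior means of the scores
    S = np.zeros((n, q, q))              # posterior covariances of the scores

    for _ in range(n_iter):
        # E-step: posterior of the scores at each time step, using only
        # the grid points observed at that time.
        for i in range(n):
            o = obs[i]
            Wo = W[o]
            Minv = np.linalg.inv(Wo.T @ Wo + sigma2 * np.eye(q))
            X[i] = Minv @ Wo.T @ (Y[i, o] - mu[o])
            S[i] = sigma2 * Minv
        XX = S + X[:, :, None] * X[:, None, :]   # E[x x^T] per time step

        # M-step: update mean and loadings one grid point at a time, then
        # the noise variance from the expected squared residuals.
        resid = 0.0
        for j in range(d):
            r = obs[:, j]
            mu[j] = np.mean(Y[r, j] - X[r] @ W[j])
            W[j] = np.linalg.solve(XX[r].sum(axis=0),
                                   X[r].T @ (Y[r, j] - mu[j]))
            err = Y[r, j] - mu[j] - X[r] @ W[j]
            resid += np.sum(err ** 2) + np.sum((S[r] @ W[j]) * W[j])
        sigma2 = resid / obs.sum()

    # Reconstruct with the scores from the last E-step and fill the gaps.
    Y_hat = mu + X @ W.T
    Y_filled = np.where(obs, Y, Y_hat)
    # Orthonormalise the loadings: the left singular vectors of W span the
    # estimated principal subspace and serve as the EOF estimates.
    eofs, _, _ = np.linalg.svd(W, full_matrices=False)
    return eofs, Y_filled, sigma2


# Synthetic demonstration (assumed data, not SST): a rank-2 field plus noise,
# with about 20% of the entries removed at random.
rng = np.random.default_rng(1)
truth = rng.standard_normal((200, 2)) @ rng.standard_normal((2, 15))
Y = truth + 0.05 * rng.standard_normal(truth.shape)
Y[rng.random(Y.shape) < 0.2] = np.nan
eofs, Y_filled, sigma2 = ppca_em_missing(Y, q=2)
missing = np.isnan(Y)
rmse = np.sqrt(np.mean((Y_filled - truth)[missing] ** 2))
print("RMSE at missing entries:", rmse)
```

The number of retained components (`q` here) is a free choice. The description notes that this choice had a significant effect on the accuracy of the reconstructed dataset, so in practice it is worth comparing several values against held-out data.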