Using PPCA to estimate EOFs in the presence of missing values

One of the problems encountered when using satellite-derived sea surface temperature (SST) data is the impossibility of retrieving data where the ocean surface is obscured by cloud. Empirical orthogonal function (EOF) analysis cannot be carried out easily when there are missing values within the dat...

Bibliographic Details
Main Authors: Houseago-Stokes, R.E., Challenor, P.G.
Format: Article in Journal/Newspaper
Language: unknown
Published: 2004
Online Access: http://nora.nerc.ac.uk/id/eprint/109645/
http://ams.allenpress.com/amsonline/?request=get-abstract&issn=1520-0426&volume=021&issue=09&page=1471
https://doi.org/10.1175/1520-0426(2004)021<1471:UPTEEI>2.0.CO;2
id ftnerc:oai:nora.nerc.ac.uk:109645
record_format openpolar
institution Open Polar
collection Natural Environment Research Council: NERC Open Research Archive
op_collection_id ftnerc
language unknown
description One of the problems encountered when using satellite-derived sea surface temperature (SST) data is the impossibility of retrieving data where the ocean surface is obscured by cloud. Empirical orthogonal function (EOF) analysis cannot be carried out easily when there are missing values within the dataset. One possible solution is to interpolate using the existing data. In this paper an alternative technique is investigated, probabilistic principal component analysis (PPCA), and applied to calculate the principal EOFs of North Atlantic SSTs. This analysis uses results obtained from interpolating the SST data using a simplified Kalman filter, with data randomly removed to simulate missing values, and then reconstructs the data using PPCA, obtaining the principal EOFs. The calculation of the EOFs was quicker than traditional EOF analysis, as the covariance matrix was estimated rather than calculated. The replacement of missing values was also computationally more efficient than using the Kalman filter, taking a fraction of the time. The expectation–maximization (EM) algorithm produced similar results to those produced through standard procedures. However, the choice of the number of EOFs to be retained had a significant effect on the accuracy of the interpolated dataset, with more EOFs reducing the accuracy of the reconstructed dataset.
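The reconstruction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified iterative-imputation variant that refits a closed-form PPCA model to the filled-in data on each pass and replaces missing entries with the posterior-mean reconstruction, ignoring the conditional covariance terms that a full EM treatment of missing values would carry. The function name `ppca_impute`, its parameters, and the convention of marking missing values with NaN are assumptions made for illustration. It also assumes every column has at least one observed value and that the number of retained EOFs `q` is smaller than the number of columns.

```python
import numpy as np


def ppca_impute(X, q, n_iter=200, tol=1e-8):
    """Fill NaN entries of X (n_samples, n_features) using q retained EOFs.

    Hypothetical helper for illustration only.
    Returns (X_filled, eofs, sigma2), where eofs has shape (n_features, q).
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    missing = np.isnan(X)

    # Initial guess: replace each missing value with its column mean.
    Xf = np.where(missing, np.nanmean(X, axis=0), X)

    for _ in range(n_iter):
        mu = Xf.mean(axis=0)
        Xc = Xf - mu

        # Eigenvalues and eigenvectors of the sample covariance via SVD.
        _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        eigvals = s**2 / n

        # Closed-form PPCA fit: the noise variance is the mean of the
        # discarded eigenvalues, and W rescales the leading eigenvectors.
        sigma2 = eigvals[q:].sum() / (d - q)
        V = Vt[:q].T
        W = V * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))

        # Posterior mean of the latent variables, then reconstruction.
        M = W.T @ W + sigma2 * np.eye(q)
        Z = np.linalg.solve(M, W.T @ Xc.T).T
        X_hat = Z @ W.T + mu

        # Only the missing entries are updated; observed data are kept.
        X_new = np.where(missing, X_hat, X)
        change = np.max(np.abs(X_new - Xf))
        Xf = X_new
        if change < tol:
            break

    return Xf, V, sigma2
```

In this sketch the choice of `q` is left to the caller, which matches the abstract's remark that the number of retained EOFs strongly affects the accuracy of the reconstructed dataset.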
format Article in Journal/Newspaper
author Houseago-Stokes, R.E.
Challenor, P.G.
title Using PPCA to estimate EOFs in the presence of missing values
publishDate 2004
genre North Atlantic
op_relation Houseago-Stokes, R.E.; Challenor, P.G. 2004 Using PPCA to estimate EOFs in the presence of missing values. Journal of Atmospheric and Oceanic Technology, 21 (9). 1471-1480. https://doi.org/10.1175/1520-0426(2004)021<1471:UPTEEI>2.0.CO;2