Data mining for the discovery of ocean climate indices

Bibliographic Details
Main Authors: Michael Steinbach, Steven Klooster, Pang-Ning Tan, Christopher Potter
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Published: 2002
Subjects:
SOI
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.421.7719
http://www-users.cs.umn.edu/~kumar/papers/sdm-oci-final.pdf
Description
Summary: Ocean climate indices (OCIs), which are time series that summarize the behavior of selected areas of the Earth’s oceans, are important tools for predicting the effect of the oceans on land climate. In this paper we describe the use of data mining to discover OCIs. In particular, we apply a shared nearest neighbor (SNN) clustering algorithm to cluster the pressure and temperature time series associated with points on the ocean, yielding clusters that represent ocean regions with relatively homogeneous behavior. The centroids of these clusters are time series that summarize the behavior of these ocean areas and thus represent potential OCIs. To evaluate cluster centroids for their usefulness as potential OCIs, we must determine which cluster centroids significantly influence the behavior of well-defined land areas. For this task, we use a variety of approaches that analyze the correlation between potential OCIs and the time series (e.g., of temperature or precipitation) that describe the behavior of land points. Based on these approaches, we have identified some cluster centroids that are almost identical to well-known OCIs, e.g., the Southern Oscillation Index (SOI) and the North Atlantic Oscillation (NAO). We also introduce two strategies for validating potential OCIs that do not correspond to well-known (and probably “stronger”) OCIs, namely, focusing on the correlation between “extreme” events on the ocean and land and looking for more persistent patterns of correlation.
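The pipeline outlined in the summary (shared-neighbor similarity between ocean time series, a cluster centroid as a candidate OCI, and correlation of that centroid with a land time series) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the use of Pearson correlation to build the nearest-neighbor lists, the mutual-neighbor rule for SNN similarity, and the toy data are all assumptions made for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def knn_lists(series, k):
    """For each series, the set of indices of its k most correlated peers.
    Assumption: correlation is the base similarity; the record does not say."""
    lists = []
    for i, s in enumerate(series):
        others = [j for j in range(len(series)) if j != i]
        ranked = sorted(others, key=lambda j: pearson(s, series[j]), reverse=True)
        lists.append(set(ranked[:k]))
    return lists

def snn_similarity(nn, i, j):
    """Number of shared neighbors, counted only when i and j list each other."""
    if i in nn[j] and j in nn[i]:
        return len(nn[i] & nn[j])
    return 0

def centroid(series, members):
    """Point-wise mean of the member series: the candidate OCI."""
    length = len(series[members[0]])
    return [sum(series[m][t] for m in members) / len(members) for t in range(length)]

# Toy, invented data: five "ocean points" with six time steps each.
ocean = [
    [1, 2, 3, 4, 5, 6],
    [2, 3, 4, 5, 6, 7.5],
    [1, 2.1, 2.9, 4.2, 5, 6.1],
    [6, 5, 4, 3, 2, 1],
    [5, 5.5, 3, 3.5, 1, 1.5],
]
land = [10, 12, 14, 16, 18, 20]  # hypothetical land temperature series

nn = knn_lists(ocean, k=2)
oci = centroid(ocean, [0, 1, 2])   # treat points 0, 1, 2 as one cluster
score = pearson(oci, land)         # how strongly the candidate tracks the land series
```

In this sketch, points 0 and 1 share one neighbor and list each other, so their SNN similarity is nonzero, while points 0 and 3 are not mutual neighbors and get similarity 0. A full clustering step (for example, grouping points by thresholding SNN similarity) is deliberately left out, since the record does not specify it.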