Multiple Regimes in Northern Hemisphere Height Fields via Mixture Model Clustering
Journal of the Atmospheric Sciences, Volume 56, p. 3704

Bibliographic Details
Main Author: Padhraic Smyth
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Published: 1997
Subjects:
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.295.8010
http://www.atmos.ucla.edu/~kayo/data/publication/smyth_etal_jas99.pdf
Description
Summary: A mixture model is a flexible probability density estimation technique, consisting of a linear combination of k component densities. Such a model is applied to estimate clustering in Northern Hemisphere (NH) 700-mb geopotential height anomalies. A key feature of this approach is its ability to estimate a posterior probability distribution for k, the number of clusters, given the data and the model. The number of clusters that is most likely to fit the data is thus determined objectively. A dataset of 44 winters of NH 700-mb fields is projected onto its two leading empirical orthogonal functions (EOFs) and analyzed using mixtures of Gaussian components. Cross-validated likelihood is used to determine the best value of k, the number of clusters. The posterior probability so determined peaks at k = 3 and thus yields clear evidence for three clusters in the NH 700-mb data. The three-cluster result is found to be robust with respect to variations in data preprocessing and data analysis parameters. The spatial patterns of the three clusters' centroids bear a high degree of qualitative similarity to the three clusters obtained independently by Cheng and Wallace, using hierarchical clustering on 500-mb NH winter data: the Gulf of Alaska ridge, the high over southern Greenland, and the enhanced climatological ridge over the Rockies. Separating the 700-mb data into Pacific (PAC) and Atlantic (ATL) sector maps reveals that the optimal k value is 2 for both the PAC and ATL sectors. The respective clusters consist of Kimoto and Ghil's Pacific–North American (PNA) and reverse PNA regimes, as well as the zonal and blocked phases of the North Atlantic oscillation. The connections between our sectorial and hemispheric results are discussed from the perspective of large-scale atmospheric dynamics.
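
The model-selection idea in the summary (fit Gaussian mixtures for several values of k and compare them by cross-validated likelihood) can be illustrated with a minimal sketch. This is not the authors' code. It makes several assumptions that are not in the record: the data are synthetic 2-D points standing in for the two leading EOF coefficients, the Gaussian components use diagonal covariances for brevity, the mixture is fitted by a plain EM loop, and all function names and parameters (`fit_mixture`, `cv_loglik`, `n_folds`, `var_floor`) are hypothetical.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (assumed here for brevity)."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(vals):
    """Numerically stable log(sum(exp(v)))."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

def point_loglik(x, model):
    """Log of the mixture density at x: log sum_j w_j N(x | mean_j, var_j)."""
    weights, means, variances = model
    return logsumexp([math.log(w) + log_gauss_diag(x, m, v)
                      for w, m, v in zip(weights, means, variances)])

def fit_mixture(data, k, rng, n_iter=50, var_floor=1e-3):
    """Fit a k-component mixture with a fixed number of EM iterations."""
    dim, n = len(data[0]), len(data)
    means = [list(p) for p in rng.sample(data, k)]  # random data points as initial means
    overall = []
    for d in range(dim):
        col = [p[d] for p in data]
        mu = sum(col) / n
        overall.append(max(sum((c - mu) ** 2 for c in col) / n, var_floor))
    variances = [list(overall) for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            logp = [math.log(w) + log_gauss_diag(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = logsumexp(logp)
            resp.append([math.exp(lp - norm) for lp in logp])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
            weights[j] = nj / n
            for d in range(dim):
                means[j][d] = sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                spread = sum(r[j] * (x[d] - means[j][d]) ** 2
                             for r, x in zip(resp, data)) / nj
                variances[j][d] = max(spread, var_floor)
    return weights, means, variances

def cv_loglik(data, k, n_folds=5, seed=0):
    """Average held-out log-likelihood per point over n_folds folds."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    total = 0.0
    for f in range(n_folds):
        test = shuffled[f::n_folds]
        train = [p for i, p in enumerate(shuffled) if i % n_folds != f]
        model = fit_mixture(train, k, rng)
        total += sum(point_loglik(x, model) for x in test)
    return total / len(data)

# Synthetic stand-in for data in the plane of the two leading EOFs:
# three well-separated clusters (illustrative only, not the 700-mb dataset).
gen = random.Random(42)
centers = [(-6.0, 0.0), (6.0, 0.0), (0.0, 8.0)]
points = [(gen.gauss(cx, 1.0), gen.gauss(cy, 1.0))
          for cx, cy in centers for _ in range(60)]

scores = {k: cv_loglik(points, k) for k in range(1, 5)}
best_k = max(scores, key=scores.get)
for k in sorted(scores):
    print(k, round(scores[k], 3))
print("best k:", best_k)
```

Scoring each k on held-out points rather than on the training data is what keeps the comparison from simply favoring the largest k, since extra components only help if they improve the fit to data the model has not seen. The sketch stops at comparing cross-validated log-likelihoods; turning them into a posterior distribution over k, as the summary describes, is not shown.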