Automated identification of characteristic droplet size distributions in stratocumulus clouds utilizing a data clustering algorithm

Bibliographic Details
Published in: Artificial Intelligence for the Earth Systems
Main Authors: Allwayin, Nithin; Larsen, Michael L.; Shaw, Alexander G.; Shaw, Raymond A.
Language: unknown
Published: 2022
Subjects:
Online Access: http://www.osti.gov/servlets/purl/1884127
https://www.osti.gov/biblio/1884127
https://doi.org/10.1175/aies-d-22-0003.1
Description
Summary: Droplet-level interactions in clouds are often parameterized by a modified gamma distribution fitted to a “global” droplet size distribution. Do the “local” droplet size distributions relevant to microphysical processes look like these average distributions? This paper describes an algorithm to search for and classify characteristic size distributions within a cloud. The approach combines hypothesis testing, specifically the Kolmogorov-Smirnov (KS) test, with a widely used class of machine-learning algorithms for identifying clusters of samples with similar properties; Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is used as the specific example for illustration. The two-sample KS test does not presume any specific distribution, is parameter-free, and avoids biases from binning. Importantly, the number of clusters is not an input parameter of DBSCAN-type algorithms but is determined independently in an unsupervised fashion. As implemented, the clustering operates on an abstract space built from the KS test results, and hence spatial correlation is not required for a cluster. The method is explored using data obtained from the Holographic Detector for Clouds (HOLODEC) deployed during the Aerosol and Cloud Experiments in the Eastern North Atlantic (ACE-ENA) field campaign. The algorithm finds evidence of clusters of nearly identical local size distributions. Cloud segments are found to have as few as one and as many as seven characteristic size distributions. To validate the algorithm’s robustness, it is tested on a synthetic dataset, where it successfully identifies the predefined distributions at plausible noise levels. The algorithm is general and is expected to be useful in other applications, such as remote sensing of cloud and rain properties.
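
The general idea in the summary (pairwise two-sample KS comparisons feeding a DBSCAN-type clustering) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the KS statistic itself is used as a precomputed distance between local samples, and the synthetic gamma-distributed droplet diameters, the sample sizes, and the values of `eps` and `min_samples` are all hypothetical choices made only for illustration. It uses the Python standard library alone; in practice one would more likely reach for `scipy.stats.ks_2samp` and `sklearn.cluster.DBSCAN` with `metric="precomputed"`.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:  # the supremum is attained at one of the data points
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def dbscan(dist, eps, min_samples):
    """Plain DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(dist)
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


# Hypothetical synthetic data: 20 "local" samples of 300 droplet diameters,
# ten drawn from each of two different gamma distributions.
rng = random.Random(0)
samples = [[rng.gammavariate(4.0, 3.0) for _ in range(300)] for _ in range(10)]
samples += [[rng.gammavariate(12.0, 2.0) for _ in range(300)] for _ in range(10)]

# Pairwise KS distances form the abstract space in which clustering happens.
n = len(samples)
dist = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        dist[i][j] = dist[j][i] = ks_statistic(samples[i], samples[j])

# eps and min_samples are illustrative guesses, not values from the paper.
labels = dbscan(dist, eps=0.2, min_samples=3)
print(labels)
```

Note that the number of clusters is never passed in: it emerges from `eps` and `min_samples`, and no spatial information about where each sample was taken enters the distance matrix.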