How should we aggregate data? Methods accounting for the numerical distributions, with an assessment of aerosol optical depth
Many applications of geophysical data – whether from surface observations, satellite retrievals, or model simulations – rely on aggregates produced at coarser spatial (e.g. degrees) and/or temporal (e.g. daily and monthly) resolution than the highest available from the technique. Almost all of these...
Published in: | Atmospheric Chemistry and Physics |
---|---|
Main Authors: | , |
Format: | Article in Journal/Newspaper |
Language: | English |
Published: |
Copernicus Publications
2019
|
Subjects: | |
Online Access: | https://doi.org/10.5194/acp-19-15023-2019 https://doaj.org/article/731825c70f9f46329ae88525fd8fc6ca |
Summary: | Many applications of geophysical data – whether from surface observations, satellite retrievals, or model simulations – rely on aggregates produced at coarser spatial (e.g. degrees) and/or temporal (e.g. daily and monthly) resolution than the highest available from the technique. Almost all of these aggregates report the arithmetic mean and standard deviation as summary statistics, which are what data users employ in their analyses. These statistics are most meaningful for normally distributed data; however, for some quantities, such as aerosol optical depth (AOD), it is well-known that distributions are on large scales closer to log-normal, for which a geometric mean and standard deviation would be more appropriate. This study presents a method of assessing whether a given sample of data is more consistent with an underlying normal or log-normal distribution, using the Shapiro–Wilk test, and tests AOD frequency distributions on spatial scales of 1 ∘ and daily, monthly, and seasonal temporal scales. A broadly consistent picture is observed using Aerosol Robotic Network (AERONET), Multiangle Imaging SpectroRadiometer (MISR), Moderate Resolution Imagining Spectroradiometer (MODIS), and Goddard Earth Observing System Version 5 Nature Run (G5NR) data. These data sets are complementary: AERONET has the highest AOD accuracy but is sparse, and MISR and MODIS represent different satellite retrieval techniques and sampling. As a model simulation, G5NR is spatiotemporally complete. As timescales increase from days to months to seasons, data become increasingly more consistent with log-normal than normal distributions, and the differences between arithmetic- and geometric-mean AOD become larger, with geometric mean becoming systematically smaller. Assuming normality systematically overstates both the typical level of AOD and its variability. There is considerable regional heterogeneity in the results: in low-AOD regions such as the open ocean and mountains, often the AOD difference is small enough ( <0.01 ) to be ... |
---|