LGHAP: the Long-term Gap-free High-resolution Air Pollutant concentration dataset, derived via tensor-flow-based multimodal data fusion

Developing a big data analytics framework for generating the Long-term Gap-free High-resolution Air Pollutant concentration dataset (abbreviated as LGHAP) is of great significance for environmental management and Earth system science analysis. By synergistically integrating multimodal aerosol data a...

Full description

Bibliographic Details
Published in:Earth System Science Data
Main Authors: K. Bai, K. Li, M. Ma, Z. Li, J. Guo, N.-B. Chang, Z. Tan, D. Han
Format: Article in Journal/Newspaper
Language:English
Published: Copernicus Publications 2022
Subjects:
Online Access:https://doi.org/10.5194/essd-14-907-2022
https://doaj.org/article/ee01206c5e86419a972b66798f328b8e
Description
Summary:Developing a big data analytics framework for generating the Long-term Gap-free High-resolution Air Pollutant concentration dataset (abbreviated as LGHAP) is of great significance for environmental management and Earth system science analysis. By synergistically integrating multimodal aerosol data acquired from diverse sources via a tensor-flow-based data fusion method, a gap-free aerosol optical depth (AOD) dataset with a daily 1 km resolution covering the period of 2000–2020 in China was generated. Specifically, data gaps in daily AOD imageries from the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard Terra were reconstructed based on a set of AOD data tensors acquired from diverse satellites, numerical analysis, and in situ air quality measurements via integrative efforts of spatial pattern recognition for high-dimensional gridded image analysis and knowledge transfer in statistical data mining. To our knowledge, this is the first long-term gap-free high-resolution AOD dataset in China, from which spatially contiguous PM 2.5 and PM 10 concentrations were then estimated using an ensemble learning approach. Ground validation results indicate that the LGHAP AOD data are in good agreement with in situ AOD observations from the Aerosol Robotic Network (AERONET), with an R of 0.91 and RMSE equaling 0.21. Meanwhile, PM 2.5 and PM 10 estimations also agreed well with ground measurements, with R values of 0.95 and 0.94 and RMSEs of 12.03 and 19.56 µ g m −3 , respectively. The LGHAP provides a suite of long-term gap-free gridded maps with a high resolution to better examine aerosol changes in China over the past 2 decades, from which three major variation periods of haze pollution in China were revealed. Additionally, the proportion of the population exposed to unhealthy PM 2.5 increased from 50.60 % in 2000 to 63.81 % in 2014 across China, which was then reduced drastically to 34.03 % in 2020. Overall, the generated LGHAP dataset has great potential to trigger multidisciplinary applications in Earth ...