A statistical comparison of survival and replacement analyses for the use of censored data in a contaminant air database: a case study from the Canadian Arctic.

Bibliographic Details
Published in: Atmospheric Environment
Main Authors: Eastoe, Emma F., Halsall, Crispin J., Heffernan, Janet E., Hung, Hayley
Format: Article in Journal/Newspaper
Language: unknown
Published: 2006
Subjects:
Online Access: https://eprints.lancs.ac.uk/id/eprint/26849/
https://doi.org/10.1016/j.atmosenv.2006.05.073
Description
Summary: The data sets of four semi-volatile organic compounds (phenanthrene, PCB-28, p,p′-DDE and α-endosulfan) from a multi-year Canadian Arctic air monitoring programme were examined to test the effect of both including and removing censored data (i.e. data that fall below analytical detection limits) on time-trend models. Two approaches were taken with the data: one that included all censored values, known as a survival analysis, and another in which censored values were replaced with a fixed constant, referred to here as a replacement analysis. Initially, the time-trend models (depicting seasonality and year-on-year trends) from the two analyses, where replacement involved only a small amount of data below instrumental detection limits, showed very few differences. This was due to the small quantity of censoring in each of the data sets (the data sets of two compounds had <10% censoring). However, when the degree of censoring was artificially increased to 50% for two of the compounds (phenanthrene and p,p′-DDE), differences in the modelled trend results were evident. Comparing the trend models fitted under both survival and replacement analyses to these highly censored data sets against the actual observed data showed that the survival analysis produced time series models that were far more robust given the quantity of censoring. The application of a Kaplan–Meier (K–M) estimator as a diagnostic tool confirmed that the survival analysis approach produced more robust trend models for both compounds. As a result, we recommend survival analysis and the retention of all censored data within a given data set; this justifies the current approach of retaining all censored data within the Canadian Arctic air databases. Blank correction of these types of databases and/or simple exclusion of censored data could confound time-series analysis.
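
As an illustration of the two treatments of censored data contrasted in the summary, the following minimal Python sketch applies a Kaplan–Meier estimator to left-censored concentrations and, separately, replaces the censored values with a fixed constant. It is not the authors' code. The data, the detection limit `DL`, the replacement constant `DL / 2` and the function names are all hypothetical, and negating the values so that left-censoring becomes right-censoring is a common device assumed here, not something the summary describes.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve for right-censored data.

    times  : observed times
    events : True if the time is an observed event, False if right-censored
    Returns a list of (t, S(t)) at each distinct event time.
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    surv = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = 0
        n_removed = 0
        # Group all observations tied at time t.
        while i < len(pairs) and pairs[i][0] == t:
            n_events += 1 if pairs[i][1] else 0
            n_removed += 1
            i += 1
        if n_events:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_removed
    return curve


def left_censored_cdf(values, detected):
    """Estimate P(X < x) at each detected value x.

    A non-detect is reported at its detection limit with detected=False.
    Negating the values turns left-censoring into right-censoring
    (an assumed, commonly used device), so the K-M estimator applies.
    """
    flipped = [-v for v in values]
    return {-t: s for t, s in kaplan_meier(flipped, detected)}


# Hypothetical concentrations (arbitrary units) with one detection limit.
DL = 10
values = [12, 15, 20, 26, 31, 40, DL, DL, DL]
detected = [True] * 6 + [False] * 3

# Survival-style treatment: censored values are kept as censored.
cdf = left_censored_cdf(values, detected)

# Replacement treatment: each non-detect becomes a fixed constant (DL / 2 here).
replaced = sorted(v if d else DL / 2 for v, d in zip(values, detected))

for x in sorted(cdf):
    print(f"P(X < {x}) ~ {cdf[x]:.3f}")
print("replacement series:", replaced)
```

In this sketch the Kaplan–Meier curve only states that a share of the observations lies below the lowest detected value, whereas the replacement series places those observations at a single point chosen by the analyst. At high censoring fractions that chosen point increasingly determines any summary or trend fitted to the replaced series.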