How real are observed trends in small correlated datasets?

The eye may perceive a significant trend in plotted time-series data, but if the model errors of nearby data points are correlated, the trend may be an illusion. We examine generalized least-squares (GLS) estimation, finding that error correlation may be underestimated in highly correlated small dat...


Bibliographic Details
Published in: Royal Society Open Science
Main Authors: Salamon, S. J., Hansen, H. J., Abbott, D.
Other Authors: Department of Education, Employment and Workplace Relations, Australian Government
Format: Article in Journal/Newspaper
Language: English
Published: The Royal Society 2019
Subjects:
Online Access: http://dx.doi.org/10.1098/rsos.181089
https://royalsocietypublishing.org/doi/pdf/10.1098/rsos.181089
https://royalsocietypublishing.org/doi/full-xml/10.1098/rsos.181089
id crroyalsociety:10.1098/rsos.181089
record_format openpolar
institution Open Polar
collection The Royal Society
op_collection_id crroyalsociety
language English
description The eye may perceive a significant trend in plotted time-series data, but if the model errors of nearby data points are correlated, the trend may be an illusion. We examine generalized least-squares (GLS) estimation, finding that error correlation may be underestimated in highly correlated small datasets by conventional techniques. This risks indicating a significant trend when there is none. A new correlation estimate based on the Durbin–Watson statistic is developed, leading to an improved estimate of autoregression with highly correlated data, thus reducing this risk. These techniques are generalized to randomly located data points in space, through the new concept of the nearest new neighbour path. We describe tests on the validity of the GLS schemes, allowing verification of the models employed. Examples illustrating our method include a 40-year record of atmospheric carbon dioxide, and Antarctic ice core data. While more conservative than existing techniques, our new GLS estimate finds a statistically significant increase in background carbon dioxide concentration, with an accelerating trend. We conclude with an example of a worldwide empirical climate model for radio propagation studies, to illustrate dealing with spatial correlation in unevenly distributed data points over the surface of the Earth. The method is generally applicable, not only to climate-related data, but to many other kinds of problems (e.g. biological, medical and geological data), where there are unequally (or randomly) spaced observations in temporally or spatially distributed datasets.
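The description above mentions GLS estimation with an autoregression estimate tied to the Durbin–Watson statistic. As a rough illustration only, the following Python sketch uses the conventional textbook relation d ≈ 2(1 − ρ) to estimate a lag-1 correlation from OLS residuals and then refits the trend after a Prais–Winsten whitening step. This is the conventional approach, not the new estimator developed in the article, whose formula is not given in this record. All function names (`ols`, `durbin_watson`, `prais_winsten`, `gls_ar1`) and the synthetic data are assumptions made for this example.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares: coefficients, residuals, coefficient standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, resid, np.sqrt(np.diag(cov))


def durbin_watson(resid):
    """Durbin-Watson statistic: near 2 for uncorrelated residuals, near 0 for strong positive correlation."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)


def prais_winsten(X, y, rho):
    """Whiten a regression whose errors are assumed to follow an AR(1) process with coefficient rho."""
    scale = np.sqrt(1.0 - rho ** 2)
    Xw = np.vstack([scale * X[:1], X[1:] - rho * X[:-1]])
    yw = np.concatenate([scale * y[:1], y[1:] - rho * y[:-1]])
    return Xw, yw


def gls_ar1(X, y):
    """Feasible GLS under an assumed AR(1) error model, with rho taken as 1 - d/2 (conventional estimate)."""
    _, resid, _ = ols(X, y)
    rho = 1.0 - durbin_watson(resid) / 2.0
    rho = float(np.clip(rho, -0.99, 0.99))  # keep the whitening transform well defined
    Xw, yw = prais_winsten(X, y, rho)
    beta, _, se = ols(Xw, yw)
    return beta, se, rho


if __name__ == "__main__":
    # Synthetic example (assumed values): a small series with a weak trend and AR(1) errors.
    rng = np.random.default_rng(0)
    n, true_rho = 40, 0.8
    err = np.zeros(n)
    for t in range(1, n):
        err[t] = true_rho * err[t - 1] + rng.normal()
    t_axis = np.arange(n, dtype=float)
    X = np.column_stack([np.ones(n), t_axis])
    y = 1.0 + 0.05 * t_axis + err

    _, _, se_ols = ols(X, y)
    _, se_gls, rho_hat = gls_ar1(X, y)
    print("estimated rho:", rho_hat)
    print("slope standard error, OLS vs GLS:", se_ols[1], se_gls[1])
```

Comparing the two slope standard errors on a short, strongly correlated series gives a feel for the concern raised in the description: ignoring error correlation tends to make a trend look more certain than it is.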
author2 Department of Education, Employment and Workplace Relations, Australian Government
format Article in Journal/Newspaper
author Salamon, S. J.
Hansen, H. J.
Abbott, D.
title How real are observed trends in small correlated datasets?
publisher The Royal Society
publishDate 2019
url http://dx.doi.org/10.1098/rsos.181089
https://royalsocietypublishing.org/doi/pdf/10.1098/rsos.181089
https://royalsocietypublishing.org/doi/full-xml/10.1098/rsos.181089
genre Antarc*
Antarctic
ice core
op_source Royal Society Open Science
volume 6, issue 3, page 181089
ISSN 2054-5703
op_rights https://royalsociety.org/journals/ethics-policies/data-sharing-mining/
op_doi https://doi.org/10.1098/rsos.181089
container_title Royal Society Open Science
container_volume 6
container_issue 3
container_start_page 181089
_version_ 1810487879788396544