Outlier accommodation with semiparametric density processes: A study of Antarctic snow density modelling

Bibliographic Details
Published in: Statistical Modelling
Main Authors: Sheanshang, Daniel M., White, Philip A., Keeler, Durban G.
Format: Article in Journal/Newspaper
Language: English
Published: SAGE Publications 2021
Online Access: http://dx.doi.org/10.1177/1471082x211043946
http://journals.sagepub.com/doi/pdf/10.1177/1471082X211043946
http://journals.sagepub.com/doi/full-xml/10.1177/1471082X211043946
id crsagepubl:10.1177/1471082x211043946
record_format openpolar
institution Open Polar
collection SAGE Publications
op_collection_id crsagepubl
language English
description In many settings, data acquisition generates outliers that can obscure inference. Therefore, practitioners often either identify and remove outliers or accommodate outliers using robust models. However, identifying and removing outliers is often an ad hoc process that affects inference, and robust methods are often too simple for some applications. In our motivating application, scientists drill snow cores and measure snow density to infer densification rates that aid in estimating snow water accumulation rates and glacier mass balances. Advanced measurement techniques can measure density at high resolution over depth but are sensitive to core imperfections, making them prone to outliers. Outlier accommodation is challenging in this setting because the distribution of outliers evolves over depth and the data demonstrate natural heteroscedasticity. To address these challenges, we present a two-component mixture model using a physically motivated snow density model and an outlier model, both of which evolve over depth. The physical component of the mixture model has a mean function with normally distributed depth-dependent heteroscedastic errors. The outlier component is specified using a semiparametric prior density process constructed through a normalized process convolution of log-normal random variables. We demonstrate that this model outperforms alternatives and can be used for various inferential tasks.
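The mixture structure summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the mean, standard deviation, and outlier-probability functions, the knot locations, the bandwidth, the basis densities, and all numeric values are hypothetical placeholders, and the outlier component is only one possible reading of "a normalized process convolution of log-normal random variables" (kernel-smoothed log-normal draws, normalized to give depth-varying weights over fixed basis densities).

```python
import math
import random


def normal_pdf(y, mean, sd):
    """Density of a Normal(mean, sd^2) distribution evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical depth-dependent ingredients (illustrative values only,
# not taken from the article). Depth in metres, density in g/cm^3.
def physical_mean(depth):
    """Assumed mean snow density, increasing with depth."""
    return 0.90 - 0.50 * math.exp(-0.05 * depth)


def physical_sd(depth):
    """Assumed heteroscedastic error scale, shrinking with depth."""
    return 0.02 + 0.03 * math.exp(-0.10 * depth)


def outlier_prob(depth):
    """Assumed probability that a measurement at this depth is an outlier."""
    return 0.05 + 0.10 * math.exp(-0.10 * depth)


# Assumed depth knots, kernel bandwidth, and one basis density
# (mean, sd) attached to each knot.
KNOTS = [0.0, 10.0, 20.0, 30.0]
BANDWIDTH = 5.0
BASIS = [(0.20, 0.10), (0.30, 0.10), (0.40, 0.10), (0.50, 0.10)]

# One draw of positive log-normal latent variables, one per knot.
rng = random.Random(0)
LATENT = [rng.lognormvariate(0.0, 1.0) for _ in KNOTS]


def outlier_weights(depth):
    """Kernel-smooth the latent variables over depth, then normalize to sum to one."""
    raw = [
        math.exp(-0.5 * ((depth - knot) / BANDWIDTH) ** 2) * z
        for knot, z in zip(KNOTS, LATENT)
    ]
    total = sum(raw)
    return [r / total for r in raw]


def outlier_density(y, depth):
    """Outlier density at this depth: weighted sum of the fixed basis densities."""
    weights = outlier_weights(depth)
    return sum(w * normal_pdf(y, m, s) for w, (m, s) in zip(weights, BASIS))


def mixture_density(y, depth):
    """Two-component mixture: physical model plus depth-varying outlier model."""
    p = outlier_prob(depth)
    physical = normal_pdf(y, physical_mean(depth), physical_sd(depth))
    return (1.0 - p) * physical + p * outlier_density(y, depth)


if __name__ == "__main__":
    # Evaluate the mixture density for a density reading of 0.45 at 5 m depth.
    print(mixture_density(0.45, 5.0))
```

Because the weights are normalized at every depth, the outlier component remains a proper density while its shape changes over depth, which is the property the description emphasizes.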
format Article in Journal/Newspaper
author Sheanshang, Daniel M.
White, Philip A.
Keeler, Durban G.
title Outlier accommodation with semiparametric density processes: A study of Antarctic snow density modelling
publisher SAGE Publications
publishDate 2021
url http://dx.doi.org/10.1177/1471082x211043946
http://journals.sagepub.com/doi/pdf/10.1177/1471082X211043946
http://journals.sagepub.com/doi/full-xml/10.1177/1471082X211043946
geographic Antarctic
geographic_facet Antarctic
genre Antarc*
Antarctic
genre_facet Antarc*
Antarctic
op_source Statistical Modelling
volume 23, issue 2, page 151-172
ISSN 1471-082X 1477-0342
op_rights http://journals.sagepub.com/page/policies/text-and-data-mining-license
op_doi https://doi.org/10.1177/1471082x211043946
container_title Statistical Modelling
container_start_page 151
_version_ 1812176325642813440