On Extrapolating Past the Range of Observed Data When Making Statistical Predictions in Ecology.

Ecologists are increasingly using statistical models to predict animal abundance and occurrence in unsampled locations. The reliability of such predictions depends on a number of factors, including sample size, how far prediction locations are from the observed data, and similarity of predictive cov...


Bibliographic Details
Published in: PLOS ONE
Main Authors: Paul B Conn, Devin S Johnson, Peter L Boveng
Format: Article in Journal/Newspaper
Language: English
Published: Public Library of Science (PLoS) 2015
Subjects: Medicine (R), Science (Q)
Online Access: https://doi.org/10.1371/journal.pone.0141416
https://doaj.org/article/162514be4a3b4c59b40119951565a27f
id ftdoajarticles:oai:doaj.org/article:162514be4a3b4c59b40119951565a27f
record_format openpolar
institution Open Polar
collection Directory of Open Access Journals: DOAJ Articles
op_collection_id ftdoajarticles
language English
topic Medicine
R
Science
Q
description Ecologists are increasingly using statistical models to predict animal abundance and occurrence in unsampled locations. The reliability of such predictions depends on a number of factors, including sample size, how far prediction locations are from the observed data, and similarity of predictive covariates in locations where data are gathered to locations where predictions are desired. In this paper, we propose extending Cook's notion of an independent variable hull (IVH), developed originally for application with linear regression models, to generalized regression models as a way to help assess the potential reliability of predictions in unsampled areas. Predictions occurring inside the generalized independent variable hull (gIVH) can be regarded as interpolations, while predictions occurring outside the gIVH can be regarded as extrapolations worthy of additional investigation or skepticism. We conduct a simulation study to demonstrate the usefulness of this metric for limiting the scope of spatial inference when conducting model-based abundance estimation from survey counts. In this case, limiting inference to the gIVH substantially reduces bias, especially when survey designs are spatially imbalanced. We also demonstrate the utility of the gIVH in diagnosing problematic extrapolations when estimating the relative abundance of ribbon seals in the Bering Sea as a function of predictive covariates. We suggest that ecologists routinely use diagnostics such as the gIVH to help gauge the reliability of predictions from statistical models (such as generalized linear, generalized additive, and spatio-temporal regression models).
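The interpolation/extrapolation distinction in the description can be illustrated for the original linear-regression case. The following is a minimal sketch, assuming the usual statement of Cook's IVH: a prediction point x0 lies inside the hull when x0'(X'X)^-1 x0 does not exceed the largest leverage among the observed design points. The function name `ivh_membership` and the toy design matrix are hypothetical, and the sketch does not reproduce the generalized (gIVH) criterion, which the description does not spell out.

```python
import numpy as np


def ivh_membership(X_obs, X_new):
    """Flag prediction points inside Cook's independent variable hull.

    X_obs: (n, p) design matrix for the observed data.
    X_new: (m, p) design matrix for the prediction locations.
    Returns a boolean array of length m (True = interpolation).
    """
    X_obs = np.asarray(X_obs, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    XtX_inv = np.linalg.inv(X_obs.T @ X_obs)
    # Leverage of each observed point: diagonal of X (X'X)^-1 X'.
    h_obs = np.einsum("ij,jk,ik->i", X_obs, XtX_inv, X_obs)
    # The same quadratic form evaluated at each prediction point.
    h_new = np.einsum("ij,jk,ik->i", X_new, XtX_inv, X_new)
    # Inside the hull if no larger than the maximum observed leverage.
    return h_new <= h_obs.max()


# Toy example (made-up data): intercept plus one covariate observed at 0..4.
X_obs = np.column_stack([np.ones(5), np.arange(5.0)])
# Predict at covariate value 2 (within the data) and 10 (far beyond it).
X_new = np.array([[1.0, 2.0], [1.0, 10.0]])
inside = ivh_membership(X_obs, X_new)
print(inside)
```

In this toy case the point at covariate value 2 is flagged as an interpolation and the point at 10 as an extrapolation. For generalized models, a diagnostic of this kind would need a different quantity in place of the linear-model leverage.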
format Article in Journal/Newspaper
author Paul B Conn
Devin S Johnson
Peter L Boveng
title On Extrapolating Past the Range of Observed Data When Making Statistical Predictions in Ecology.
publisher Public Library of Science (PLoS)
publishDate 2015
url https://doi.org/10.1371/journal.pone.0141416
https://doaj.org/article/162514be4a3b4c59b40119951565a27f
geographic Bering Sea
genre Bering Sea
op_source PLoS ONE, Vol 10, Iss 10, p e0141416 (2015)
op_relation http://europepmc.org/articles/PMC4619888?pdf=render
https://doaj.org/toc/1932-6203
1932-6203
doi:10.1371/journal.pone.0141416
https://doaj.org/article/162514be4a3b4c59b40119951565a27f
op_doi https://doi.org/10.1371/journal.pone.0141416
container_title PLOS ONE
container_volume 10
container_issue 10
container_start_page e0141416