Assuming independence in spatial latent variable models: Consequences and implications of misspecification

Bibliographic Details
Published in: Biometrics
Main Authors: Hui, Francis K.C.; Hill, Nicole A.; Welsh, A.H.
Other Authors: Australian Research Council
Format: Article in Journal/Newspaper
Language: English
Published: Oxford University Press (OUP) 2021
Online Access: http://dx.doi.org/10.1111/biom.13416
https://onlinelibrary.wiley.com/doi/pdf/10.1111/biom.13416
https://onlinelibrary.wiley.com/doi/full-xml/10.1111/biom.13416
https://onlinelibrary.wiley.com/doi/am-pdf/10.1111/biom.13416
Description
Summary: Multivariate spatial data, where multiple responses are simultaneously recorded across spatially indexed observational units, are routinely collected in a wide variety of disciplines. For example, the Southern Ocean Continuous Plankton Recorder survey collects records of zooplankton communities in the Indian sector of the Southern Ocean, with the aim of identifying and quantifying spatial patterns in biodiversity in response to environmental change. One increasingly popular method for modeling such data is spatial generalized linear latent variable models (GLLVMs), where the correlation across sites is captured by a spatial covariance function in the latent variables. However, little is known about the impact of misspecifying the latent variable correlation structure on inference of various parameters in such models. To address this gap in the literature, we investigate how misspecifying and assuming independence for the latent variables' correlation structure impacts estimation and inference in spatial GLLVMs. Through both theory and numerical studies, we show that the performance of maximum likelihood estimation and inference on regression coefficients under misspecification depends on a combination of the response type, the magnitude of the true regression coefficient, and the corresponding loadings, and, most importantly, whether the corresponding covariate is (also) spatially correlated. On the other hand, estimation and inference of truly nonzero loadings and prediction of latent variables are consistently not robust to misspecification of the latent variable correlation structure.
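
To make the setting in the summary concrete, the following is a minimal simulation sketch, not the authors' code. It generates multivariate count data from a GLLVM with a single latent variable whose correlation across sites is either spatial or independent. The exponential covariance function, the Poisson response type, the single non-spatial covariate, and all names and parameter values (`exp_cov`, `simulate_gllvm`, `range_par`, the loading and coefficient ranges) are illustrative assumptions and do not come from the article.

```python
import numpy as np


def exp_cov(coords, range_par):
    """Exponential spatial covariance C(d) = exp(-d / range_par) (assumed form)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-dist / range_par)


def simulate_gllvm(n_sites=200, n_resp=5, range_par=0.3, spatial=True, seed=0):
    """Simulate Poisson responses from a one-latent-variable GLLVM.

    spatial=True  -> latent variable is spatially correlated across sites.
    spatial=False -> latent variable is independent across sites
                     (the structure assumed under misspecification).
    """
    rng = np.random.default_rng(seed)
    coords = rng.uniform(0.0, 1.0, size=(n_sites, 2))  # site locations in the unit square
    x = rng.normal(size=n_sites)                       # one covariate, not spatially correlated here

    # Response-specific intercepts, slopes, and loadings (hypothetical values).
    beta0 = rng.normal(0.0, 0.5, size=n_resp)
    beta1 = rng.normal(0.0, 0.5, size=n_resp)
    loadings = rng.uniform(0.5, 1.0, size=n_resp)

    # Latent variable across sites: z ~ N(0, cov).
    cov = exp_cov(coords, range_par) if spatial else np.eye(n_sites)
    chol = np.linalg.cholesky(cov + 1e-8 * np.eye(n_sites))  # small jitter for stability
    z = chol @ rng.normal(size=n_sites)

    # Linear predictor with log link: eta[i, j] = beta0[j] + x[i] * beta1[j] + z[i] * loadings[j].
    eta = beta0[None, :] + np.outer(x, beta1) + np.outer(z, loadings)
    y = rng.poisson(np.exp(eta))
    return coords, x, z, y
```

Calling `simulate_gllvm(spatial=True)` gives data from a spatially correlated latent variable. Fitting a model that assumes the `spatial=False` structure to such data would reproduce the kind of misspecification the article studies. The fitting step is not shown here.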