Modeling joint abundance of multiple species using Dirichlet process mixtures

Bibliographic Details
Published in: Environmetrics
Main Authors: Devin S. Johnson, Elizabeth H. Sinclair
Format: Article in Journal/Newspaper
Language: unknown
Online Access: https://doi.org/10.1002/env.2440
Description
Summary: We present a method for modeling the distributions of multiple species simultaneously using Dirichlet process random effects to cluster species into guilds. Guilds are ecological groups of species that behave or react similarly to some environmental conditions. By modeling latent guild structure, we capture the cross-correlations in abundance or occurrence of species over surveys. In addition, ecological information about the community structure is obtained as a by-product of the model. By clustering species into similar functional groups, prediction uncertainty of community structure at additional sites is reduced relative to treating each species separately. The proposed model also improves on previously proposed joint species distribution models by reducing the number of parameters necessary to capture interspecies correlations and by eliminating the need for a priori information on the number of groups or a distance metric over species traits. The method is illustrated with a small simulation demonstration, as well as an analysis of a mesopelagic fish survey from the eastern Bering Sea near Alaska. The simulation analysis shows that guild membership can be recovered as the differences between groups become larger. If guild differences are small, the model naturally collapses the species into a small number of guilds, which increases predictive efficiency by reducing the number of parameters to what the data support.
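
The summary notes that the number of guilds does not have to be specified in advance. As a rough illustration of why a Dirichlet process prior allows this, the sketch below draws one partition of species from a Chinese restaurant process, the clustering rule implied by a Dirichlet process. It is not the authors' model: the function name sample_crp_guilds, the concentration parameter alpha, and the species count are hypothetical choices made for this example, and no abundance data or regression terms are involved.

```python
import random


def sample_crp_guilds(n_species, alpha, seed=0):
    """Draw one partition of species into guilds from a Chinese restaurant process.

    Illustrative only: alpha (> 0) is an assumed concentration parameter,
    not a value taken from the article.
    """
    rng = random.Random(seed)
    labels = []  # guild index assigned to each species, in order
    sizes = []   # number of species currently in each guild
    for _ in range(n_species):
        # An existing guild k is chosen with weight sizes[k];
        # a brand-new guild is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new guild
        else:
            sizes[k] += 1     # join an existing guild
        labels.append(k)
    return labels


if __name__ == "__main__":
    # Larger alpha tends to produce more guilds; the count is never fixed up front.
    for alpha in (0.1, 1.0, 10.0):
        labels = sample_crp_guilds(n_species=30, alpha=alpha, seed=1)
        print(alpha, len(set(labels)), labels)
```

In a full model of the kind the summary describes, such a partition would act as a prior over guild membership and be updated by the survey data, so that weakly separated guilds merge and strongly separated ones stay distinct.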