Introducing zoid: A mixture model and R package for modeling proportional data with zeros and ones in ecology
Published in: | Ecology |
---|---|
Main Authors: | , , , , , |
Format: | Article in Journal/Newspaper |
Language: | unknown |
Published: | eScholarship, University of California 2022 |
Subjects: | |
Online Access: | https://escholarship.org/uc/item/0wz5k03h https://doi.org/10.1002/ecy.3804 |
Summary: | Many ecological data sets are proportional, representing mixtures of constituent elements such as species, populations, or strains. Analyses of proportional data are challenged by categories with zero observations (zeros), all observations (ones), and overdispersion. In lieu of ad hoc data adjustments, we describe and evaluate a zero-and-one inflated Dirichlet regression model, with its corresponding R package (zoid), capable of handling observed data $x$ consisting of three possible categories: zeros, proportions, or ones. Instead of fitting the model to observations of single biological units (e.g., individual organisms) within a sample, we sum proportional contributions across units and estimate mixture proportions using one aggregated observation per sample. Optional estimation of overdispersion and covariate influences expands model applications. We evaluate model performance, as implemented in Stan, using simulations and two ecological case studies. We show that zoid successfully estimates mixture proportions using simulated data with varying sample sizes and is robust to overdispersion and covariate structure. In empirical case studies, we estimate the composition of a mixed-stock Chinook salmon (Oncorhynchus tshawytscha) fishery and analyze the stomach contents of Atlantic cod (Gadus morhua). Our implementation of the model as an R package facilitates its application to varied ecological data sets composed of proportional observations. |
---|
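The aggregation step described in the summary (summing proportional contributions across biological units to obtain one observation per sample, whose components are then zeros, proportions, or ones) can be illustrated with a minimal Python sketch. This is not the zoid package's code or API: the function names `aggregate_sample` and `classify_components`, the renormalization to a unit sum, and the tolerance `tol` are assumptions made for illustration only.

```python
from typing import List


def aggregate_sample(unit_proportions: List[List[float]]) -> List[float]:
    """Sum per-unit proportions over units, then rescale so the result sums to 1.

    Each inner list is one biological unit's proportional contribution to
    every category. Rescaling by the grand total is an assumption of this sketch.
    """
    if not unit_proportions:
        raise ValueError("a sample needs at least one unit")
    n_categories = len(unit_proportions[0])
    totals = [0.0] * n_categories
    for unit in unit_proportions:
        if len(unit) != n_categories:
            raise ValueError("all units must have the same number of categories")
        for k, value in enumerate(unit):
            totals[k] += value
    grand_total = sum(totals)
    if grand_total <= 0:
        raise ValueError("sample has no positive contributions")
    return [t / grand_total for t in totals]


def classify_components(composition: List[float], tol: float = 1e-12) -> List[str]:
    """Label each component as 'zero', 'one', or 'proportion' (hypothetical helper)."""
    labels = []
    for p in composition:
        if p <= tol:
            labels.append("zero")
        elif p >= 1.0 - tol:
            labels.append("one")
        else:
            labels.append("proportion")
    return labels


# Two units, three categories: the third category is never observed.
mixed = aggregate_sample([[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]])
mixed_labels = classify_components(mixed)

# Every unit belongs entirely to the first category.
pure = aggregate_sample([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
pure_labels = classify_components(pure)
```

In this sketch the first sample yields two interior proportions and one zero, while the second yields a one and two zeros, which are the three kinds of observed component the summary says the model must accommodate.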