Data and code to replicate: Diet analysis using generalized linear models derived from foraging processes using R package mvtweedie


Bibliographic Details
Main Authors: Thorson, James T., Arimitsu, Mayumi L., Levi, Taal, Roffler, Gretchen H.
Format: Software
Language: unknown
Published: Zenodo 2021
Online Access: https://dx.doi.org/10.5281/zenodo.5579712
https://zenodo.org/record/5579712
Description
Summary: Diet analysis integrates a wide variety of visual, chemical and biological identification of prey. Samples are often treated as compositional data, where each prey is analyzed as a continuous percentage of the total. However, analyzing compositional data results in analytical challenges, e.g., highly parameterized models or prior transformation of data. Here, we present a novel approximation involving a Tweedie generalized linear model (GLM). We first review how this approximation emerges from considering predator foraging as a thinned and marked point process (with marks representing prey species and individual prey size). This derivation can motivate future theoretical and applied developments. We then provide a practical tutorial for the Tweedie GLM using the new package mvtweedie, which extends capabilities of widely used packages in R (mgcv and ggplot2) by transforming output to calculate prey compositions. We demonstrate this approach and software using two examples. Tufted puffins (Fratercula cirrhata) provisioning their chicks at a colony in the northern Gulf of Alaska show decadal prey switching among sand lance and prowfish (1980-2000) and then Pacific herring and capelin (2000-2020), while wolves (Canis lupus ligoni) in Southeast Alaska forage on mountain goats and marmots in northern uplands and marine mammals in seaward island coastlines.

File list:
Reproducible_script_R1.R
Wolf.csv
Seabird.csv
MDO.seabirdforagingarea.SST.csv

File descriptions:
Reproducible_script_R1.R: R script used to replicate all analysis and figures in the main text and appendices. See comments at top for directions prior to running.

Wolf.csv: CSV file containing four columns used in the wolf metabarcoding case study in Fig. 3 of the main text:
  "Latitude": latitude of scat sample in decimal degrees
  "Longitude": longitude of scat sample
  "group": prey taxonomic group used in analysis
  "Response": metabarcoding read count used as response variable

Seabird.csv: CSV file containing three columns used in the seabird bill-load case study in Fig. 2 of the main text:
  "Year": year AD for bill-load sample
  "group": prey taxonomic group used in analysis
  "Response": bill-load count used as response variable

MDO.seabirdforagingarea.SST.csv: CSV file containing two additional columns used in the seabird bill-load case study in Fig. 2 of the main text:
  "Year": year AD, including all years used in Fig. 2
  "SST_mean": average sea surface temperature near Middleton Island
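
Illustrative sketch (not part of the deposit): the summary says model output is transformed to calculate prey compositions. The Python snippet below shows one way such a transformation could look, assuming it amounts to dividing each prey group's expected response by the total across groups. It is not the mvtweedie implementation and fits no Tweedie GLM; simple per-group means stand in for model predictions. The column names "Year", "group" and "Response" follow the Seabird.csv description, but the rows, values and function names are invented for illustration only.

```python
from collections import defaultdict


def mean_response_by_group(rows, year):
    """Average the Response column per prey group for one year.

    A stand-in for model predictions: a fitted GLM would supply
    expected responses instead of these raw means.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        if row["Year"] == year:
            sums[row["group"]] += row["Response"]
            counts[row["group"]] += 1
    return {group: sums[group] / counts[group] for group in sums}


def prey_proportions(expected):
    """Normalize per-group expected responses so they sum to 1."""
    total = sum(expected.values())
    if total <= 0:
        raise ValueError("expected responses must have a positive total")
    return {group: value / total for group, value in expected.items()}


# Made-up rows that mimic the Seabird.csv layout (NOT real data).
rows = [
    {"Year": 1990, "group": "sand lance", "Response": 6.0},
    {"Year": 1990, "group": "sand lance", "Response": 2.0},
    {"Year": 1990, "group": "capelin", "Response": 0.0},
    {"Year": 1990, "group": "capelin", "Response": 2.0},
    {"Year": 1990, "group": "herring", "Response": 3.0},
    {"Year": 1990, "group": "herring", "Response": 7.0},
]

expected = mean_response_by_group(rows, 1990)
proportions = prey_proportions(expected)
for group, share in sorted(proportions.items()):
    print(f"{group}: {share:.2f}")
```

Normalizing after prediction lets each prey group be modeled on its own response scale, while the reported composition still sums to one across groups.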