Data from: How to best threshold and validate stacked species assemblages? Community optimisation might hold the answer

Bibliographic Details
Main Authors: Scherrer, Daniel; D'Amen, Manuela; Fernandes, Rui F.; Mateo, Rubén G.; Guisan, Antoine
Format: Dataset
Language: English
Published: Dryad 2018
Online Access: https://dx.doi.org/10.5061/dryad.nf925ps
http://datadryad.org/stash/dataset/doi:10.5061/dryad.nf925ps
Description
Summary: PLEASE NOTE, THESE DATA ARE ALSO REFERRED TO IN TWO OTHER PUBLICATIONS. PLEASE SEE DOI: 10.1111/ddi.12548 AND https://doi.org/10.1111/geb.12357
The popularity of species distribution models (SDMs) and the associated stacked species distribution models (S‐SDMs) as tools for community ecologists has increased considerably in recent years. However, while some consensus has been reached about the best methods to threshold and evaluate individual SDMs, little agreement exists on how best to assemble individual SDMs into communities, that is, how to build and assess S‐SDM predictions. Here, we used published data on insects and plants collected within the same study region to test (a) whether the most established thresholding methods for optimising single-species predictions are also the best choice for predicting species assemblage composition, or whether community‐based thresholding can be a better alternative, and (b) whether the optimal thresholding method depends on taxa, prevalence distribution and/or species richness. Based on a comparison of different evaluation approaches, we provide guidelines for a robust community cross‐validation framework, for use when spatially or temporally independent data are unavailable. Our results showed that the selection of the “optimal” assembly strategy depends mostly on the evaluation approach rather than on taxa, prevalence distribution, regional species pool or species richness. If evaluated with independent data or reliable cross‐validation, community‐based thresholding seems superior to single-species optimisation. However, many published studies did not evaluate community projections with independent data, which often led to overoptimistic community evaluation metrics based on single-species optimisation. The fact that most of the reviewed S‐SDM studies reported over‐fitted community evaluation metrics highlights the importance of developing clear evaluation guidelines for community models. Here, we take a first step in this direction, providing a framework for cross‐validation at the community level.

Forest plant distribution and environmental data in the western Swiss Alps
Plant species (presence-absence records) and environmental data for each of the 3076 forest plots. The Scherrer et al. (2018, Methods in Ecology and Evolution) paper uses the associated Dryad dataset of plant species distribution data and environmental predictors to compare a number of thresholding methods for stacked species distribution models (S-SDMs) and to determine how best to evaluate them with an appropriate cross-validation framework. Subsets of these data have also been used in previous works by the same research group: Scherrer et al. (2017, Diversity and Distributions) and others (see full publication list under http://www.unil.ch/ecospat).
forest_speciesEnvironment.csv

Butterfly distribution and environmental data in the western Swiss Alps
Butterfly species (presence-absence records) and environmental data for each of the 192 plots. The Scherrer et al. (2018, Methods in Ecology and Evolution) paper uses the associated Dryad dataset of butterfly species distribution data and environmental predictors to compare a number of thresholding methods for stacked species distribution models (S-SDMs) and to determine how best to evaluate them with an appropriate cross-validation framework. Subsets of these data have also been used in previous works by the same research group: Pellissier et al. (2012, Ecography), D’Amen et al. (2015, Global Ecology and Biogeography) and others (see full publication list under http://www.unil.ch/ecospat).
butterflies_speciesEnvironment.csv

Grasshopper distribution and environmental data in the western Swiss Alps
Grasshopper species (presence-absence records) and environmental data for each of the 202 plots. The Scherrer et al. (2018, Methods in Ecology and Evolution) paper uses the associated Dryad dataset of grasshopper species distribution data and environmental predictors to compare a number of thresholding methods for stacked species distribution models (S-SDMs) and to determine how best to evaluate them with an appropriate cross-validation framework. Subsets of these data have also been used in previous works by the same research group: Pradervand et al. (2013, Bulletin de la Société Vaudoise des Sciences Naturelles), D’Amen et al. (2015, Global Ecology and Biogeography) and others (see full publication list under http://www.unil.ch/ecospat).
grasshoppers_speciesEnvironment.csv
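
Illustrative note: the summary contrasts thresholds tuned per species with thresholds chosen to optimise a community-level metric. The Python sketch below shows one possible reading of the community-based idea and is not the authors' implementation. It assumes a single threshold shared by all species, uses Sørensen similarity as the community metric, and runs on invented toy values rather than on the CSV files listed above. All function and variable names are hypothetical.

```python
# Illustrative sketch only. The toy values below are invented and do not
# come from the Dryad files; the function names are hypothetical.


def stack_predictions(probabilities, thresholds):
    """Binarise per-species probabilities, plot by plot.

    probabilities: list of plots, each a list of per-species probabilities.
    thresholds: one threshold per species.
    """
    return [
        [1 if p >= t else 0 for p, t in zip(plot, thresholds)]
        for plot in probabilities
    ]


def sorensen(observed, predicted):
    """Sorensen similarity between two presence-absence vectors."""
    shared = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    total = sum(observed) + sum(predicted)
    # Two empty assemblages are treated as identical (an assumption).
    return 1.0 if total == 0 else 2.0 * shared / total


def mean_sorensen(observed, probabilities, thresholds):
    """Average plot-level Sorensen similarity of the stacked prediction."""
    predicted = stack_predictions(probabilities, thresholds)
    scores = [sorensen(o, p) for o, p in zip(observed, predicted)]
    return sum(scores) / len(scores)


def best_common_threshold(observed, probabilities, candidates):
    """Pick the shared threshold that maximises mean Sorensen similarity."""
    n_species = len(observed[0])
    return max(
        candidates,
        key=lambda t: mean_sorensen(observed, probabilities, [t] * n_species),
    )


# Toy example: 3 plots x 3 species.
observed = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
probabilities = [[0.8, 0.3, 0.6], [0.2, 0.7, 0.9], [0.6, 0.55, 0.4]]
candidates = [0.25, 0.5, 0.75]

chosen = best_common_threshold(observed, probabilities, candidates)
print(chosen, mean_sorensen(observed, probabilities, [chosen] * 3))
```

As the summary stresses, a threshold selected this way should be scored on plots that were not used to select it (independent data or cross-validation); evaluating it on the same plots, as this toy example does, would give an overoptimistic metric.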