Overcoming the pitfalls of categorizing continuous variables in ecology and evolutionary biology


Bibliographic Details
Main Authors: Beltran, Roxanne; Tarwater, Corey
Format: Dataset
Language: English
Published: Dryad 2023
Online Access: https://dx.doi.org/10.5061/dryad.5x69p8d9r
https://datadryad.org/stash/dataset/doi:10.5061/dryad.5x69p8d9r
Description
Summary: Many metrics in biological research – from body size to life history timing to environmental metrics – are measured continuously (e.g., body size in grams) but analyzed as categories (e.g., large versus small). The pitfalls of categorization are well-recognized in statistics, but many scientists in the fields of ecology, evolution, and behavior may not be aware of this literature. These fields lack a review of common examples and feasible solutions to avoid the hazards of categorizing continuous data. Our goal was to summarize current practices of categorizing continuous predictors in ecology and evolutionary biology and provide guidance for overcoming those pitfalls. We conducted a mini-review of 72 recent publications in six popular journals to quantify the prevalence of categorization. We then summarized commonly categorized metrics and simulated a dataset to demonstrate the drawbacks of categorization using common metrics and realistic examples from ecology and evolutionary biology. We show that ...

# Overcoming the pitfalls of categorizing continuous variables in ecology and evolutionary biology

[https://doi.org/10.5061/dryad.5x69p8d9r](https://doi.org/10.5061/dryad.5x69p8d9r)

We simulated data to quantify the detrimental impact of categorizing continuous variables using various statistical breakpoints and sample sizes (details below). To give the example biological relevance, we created a dataset that illustrates the complexity of life history theory and climate change impacts, and contains a predictor variable that is frequently categorized (Table 2) - reproductive timing in one year and its effect on body size in the following year.

A reasonable research question would be: How does timing of reproduction in year t influence body mass at the start of the breeding season in year t+1? For illustrative purposes, let’s say we collected data from individually banded penguins in Antarctica.
Based on the mechanistic relationships between seasonally available sea ice and food availability, we hypothesize ...
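
The general idea of comparing a continuous predictor with a categorized version of it can be sketched in a few lines of Python. This is a minimal illustration, not the dataset's own simulation code: the linear relationship, the slope, the noise level, the uniform timing distribution, the median-split breakpoint, and all function names (`simulate`, `compare`, `r_squared`) are assumptions made for the example.

```python
import numpy as np


def r_squared(x, y):
    """In-sample R^2 of a simple least-squares line y ~ x."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    return 1.0 - resid.var() / y.var()


def simulate(n=200, slope=-0.05, noise_sd=1.0, seed=1):
    """Hypothetical data: reproductive timing (day of season) in year t
    and body mass (kg) in year t+1. All parameter values are assumed."""
    rng = np.random.default_rng(seed)
    timing = rng.uniform(0, 60, n)
    mass = 5.0 + slope * timing + rng.normal(0, noise_sd, n)
    return timing, mass


def compare(n=200, seed=1):
    """Return (R^2 with continuous timing, R^2 with a median split)."""
    timing, mass = simulate(n=n, seed=seed)
    # Categorize: 0 = "early" breeder, 1 = "late" breeder (median breakpoint).
    late = (timing > np.median(timing)).astype(float)
    return r_squared(timing, mass), r_squared(late, mass)


if __name__ == "__main__":
    # Vary the sample size, as in the description above.
    for n in (30, 100, 1000):
        r2_continuous, r2_categorical = compare(n=n)
        print(f"n={n}: continuous R^2={r2_continuous:.2f}, "
              f"median-split R^2={r2_categorical:.2f}")
```

Under these assumptions, the median-split model typically explains less of the variation in body mass than the continuous model, because the split discards the within-group differences in timing. Other breakpoints (for example terciles) could be compared by changing how `late` is constructed.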