Mission: Implausible — Revealing rogue marine species in records across biodiversity data platforms


Bibliographic Details
Published in: Biodiversity Information Science and Standards
Main Authors: Nozères, Claude; Kennedy, Mary
Format: Article in Journal/Newspaper
Language: unknown
Published: Pensoft Publishers 2019
Subjects:
Online Access: https://doi.org/10.3897/biss.3.36002
Description
Summary: Online biodiversity platforms publish datasets with graphical tools to help with quality control of submitted records, but more could be done to make the data robust for ecological analyses. Attention has focused mostly on automating checks for obvious errors, including misspelled names and synonyms, dates, or coordinates. However, a manual review of species identifications and distributions may uncover improbable records, such as a species reported in an area far from its usual range, or a rare species found in an area that has many more records of a related species. Examples are shown from checklists constructed for the Northwest Atlantic, using information from the World Register of Marine Species (WoRMS, http://www.marinespecies.org) and the Ocean Biogeographic Information System (OBIS, https://obis.org). Reviewing rare species records revealed some misidentifications, but in other instances the rare species was valid and it was the commonly reported species that needed correction. Confirmations were obtained by comparing records not only from different regions but also across platforms, including photos from observers on iNaturalist Canada (https://inaturalist.ca), genetic analyses on the Barcode of Life Data System (BOLD, http://www.boldsystems.org), and literature in the Biodiversity Heritage Library (BHL, https://www.biodiversitylibrary.org). While this exercise succeeded in validating the marine taxa of a region, it is an obvious candidate for automation in three areas: 1) flagging records of improbable taxa in a region, 2) comparing records with different types of information (e.g., specimen photos, genetic groupings, or literature records), and 3) updating users and providers when records get flagged as unusual or are modified. The first approach could be explored using online graphics tools or R software packages (rOpenSci, https://ropensci.org).
The second toolset, comparing records across platforms, is partially realized with some linkages already operating between WoRMS, OBIS, BOLD, BHL, iNaturalist, and the ...
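
The first automation area named in the summary, flagging records of improbable taxa in a region, could be sketched roughly as follows. This is only an illustration and not the authors' method: the function name `flag_rare_congeners`, the thresholds `max_count` and `min_ratio`, and the species and region names are all hypothetical, and the records are made-up placeholders, not data from OBIS or WoRMS. The sketch uses Python for self-containment; the summary itself suggests R packages.

```python
from collections import Counter, defaultdict


def flag_rare_congeners(records, max_count=2, min_ratio=10):
    """Flag species that are rare in a region where a congener is common.

    records: iterable of (region, species) pairs, species as "Genus epithet".
    max_count and min_ratio are illustrative thresholds, not published values.
    Returns (region, species, count, dominant_species, dominant_count) tuples.
    """
    # Count records per species within each (region, genus) group.
    counts = defaultdict(Counter)
    for region, species in records:
        genus = species.split()[0]
        counts[(region, genus)][species] += 1

    flagged = []
    for (region, _genus), per_species in sorted(counts.items()):
        dominant, dominant_n = per_species.most_common(1)[0]
        for species, n in sorted(per_species.items()):
            is_rare = n <= max_count
            is_outnumbered = dominant_n >= min_ratio * n
            if species != dominant and is_rare and is_outnumbered:
                flagged.append((region, species, n, dominant, dominant_n))
    return flagged


# Made-up placeholder records (hypothetical names, not real taxa or regions).
records = (
    [("Region A", "Exemplum commune")] * 40
    + [("Region A", "Exemplum rarum")] * 2
    + [("Region B", "Exemplum rarum")] * 30
    + [("Region B", "Exemplum commune")] * 25
)

flagged = flag_rare_congeners(records)
for row in flagged:
    print(row)
```

A flag here only marks a pair of related species for manual review. As the summary notes, the rare species is sometimes the valid one and the commonly reported species is the one needing correction, so the output lists both names and counts rather than judging either.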