Horses in the Cloud: big data exploration and mining of fossil and extant Equus (Mammalia: Equidae)

Extant species of the genus Equus (e.g., horses, asses, and zebras) have a widespread distribution today on all continents except Antarctica. Extinct species of Equus represented by fossils were likewise widely distributed in the Pliocene and even more so during the Pleistocene. In order to understa...

Bibliographic Details
Published in: Paleobiology
Main Authors: Bruce J. MacFadden, Robert P. Guralnick
Format: Text
Language: English
Published: The Paleontological Society 2016
Subjects:
Online Access: https://doi.org/10.1017/pab.2016.42
id ftbioone:10.1017/pab.2016.42
record_format openpolar
spelling ftbioone:10.1017/pab.2016.42 2024-06-02T07:56:31+00:00 Horses in the Cloud: big data exploration and mining of fossil and extant Equus (Mammalia: Equidae) Bruce J. MacFadden Robert P. Guralnick Bruce J. MacFadden Robert P. Guralnick world 2016-10-21 text/HTML https://doi.org/10.1017/pab.2016.42 en eng The Paleontological Society doi:10.1017/pab.2016.42 All rights reserved. https://doi.org/10.1017/pab.2016.42 Text 2016 ftbioone https://doi.org/10.1017/pab.2016.42 2024-05-07T00:48:07Z Text Antarc* Antarctica BioOne Online Journals Paleobiology 43 1 1 14
institution Open Polar
collection BioOne Online Journals
op_collection_id ftbioone
language English
description Extant species of the genus Equus (e.g., horses, asses, and zebras) have a widespread distribution today on all continents except Antarctica. Extinct species of Equus represented by fossils were likewise widely distributed in the Pliocene and even more so during the Pleistocene. In order to understand the efficacy of “big data” for (paleo)biogeographic analyses, location records (latitude, longitude) and fossil occurrences for the genus Equus were mined and further explored from six databases, including iDigBio, Paleobiology Database, VertNet, BISON, Neotoma, and GBIF. These were chosen from a priori knowledge of where relevant data might be aggregated. We also realized that these databases have different objectives and data sources and therefore would provide a useful comparative study of the widespread taxon Equus in space and time. The mining of Equus data from these six sources yielded a combined total of 123.8K location records, including 116.2K fossil specimens. These include individual points that are unique, that is, only occurring in one of these databases, and those that are duplicated in multiple databases. Of the six databases, three (iDigBio, Paleobiology Database, and GBIF) were judged to be the most useful in the Equus use case. Most of the databases are biased toward North American records, thus limiting the reconstruction of the actual distribution of the genus Equus in space and time outside of this continent. Although Equus has a large number of digitally accessible records, fundamentally interesting questions pertaining to evolutionary dynamics and extinction geography are still a challenge for these kinds of biodiversity databases due primarily to the lack of sufficiently dense and precise temporal data.
author2 Bruce J. MacFadden
Robert P. Guralnick
format Text
author Bruce J. MacFadden
Robert P. Guralnick
spellingShingle Bruce J. MacFadden
Robert P. Guralnick
Horses in the Cloud: big data exploration and mining of fossil and extant Equus (Mammalia: Equidae)
author_facet Bruce J. MacFadden
Robert P. Guralnick
author_sort Bruce J. MacFadden
title Horses in the Cloud: big data exploration and mining of fossil and extant Equus (Mammalia: Equidae)
title_short Horses in the Cloud: big data exploration and mining of fossil and extant Equus (Mammalia: Equidae)
title_full Horses in the Cloud: big data exploration and mining of fossil and extant Equus (Mammalia: Equidae)
title_fullStr Horses in the Cloud: big data exploration and mining of fossil and extant Equus (Mammalia: Equidae)
title_full_unstemmed Horses in the Cloud: big data exploration and mining of fossil and extant Equus (Mammalia: Equidae)
title_sort horses in the cloud: big data exploration and mining of fossil and extant equus (mammalia: equidae)
publisher The Paleontological Society
publishDate 2016
url https://doi.org/10.1017/pab.2016.42
op_coverage world
genre Antarc*
Antarctica
genre_facet Antarc*
Antarctica
op_source https://doi.org/10.1017/pab.2016.42
op_relation doi:10.1017/pab.2016.42
op_rights All rights reserved.
op_doi https://doi.org/10.1017/pab.2016.42
container_title Paleobiology
container_volume 43
container_issue 1
container_start_page 1
op_container_end_page 14
_version_ 1800756658374180864