Analysis of NSIDC Dataset Downloads and Metadata

Few research studies have quantitatively analyzed metadata elements associated with scientific data reuse. By using metadata and dataset download rates from the National Snow and Ice Data Center, we address whether there are key indicators in data repository metadata that show a statistically signif...

Bibliographic Details
Main Authors: Kolesnikova, Yulia; Lathrop, Adam; Norlander, Bree; Yan, An
Format: Other/Unknown Material
Language: unknown
Published: Center for Open Science 2017
Subjects:
Online Access: http://dx.doi.org/10.31219/osf.io/5mh9n
id crcenteros:10.31219/osf.io/5mh9n
record_format openpolar
institution Open Polar
collection COS Center for Open Science (via Crossref)
op_collection_id crcenteros
language unknown
description Few research studies have quantitatively analyzed metadata elements associated with scientific data reuse. By using metadata and dataset download rates from the National Snow and Ice Data Center, we address whether there are key indicators in data repository metadata that show a statistically significant correlation with the download count of a dataset and whether we can predict data reuse using machine learning techniques. We used the download rate by unique IP addresses for individual datasets as our dependent variable and as a proxy for data reuse. Our analysis shows that the following metadata elements in NSIDC datasets are positively correlated with download rates: year of citation, number of data formats, number of contributors, number of platforms, number of spatial coverage areas, number of locations, and number of keywords. Our results are applicable to researchers and professionals working with data and add to the small body of work addressing metadata best practices for increasing discovery of data.
format Other/Unknown Material
author Kolesnikova, Yulia; Lathrop, Adam; Norlander, Bree; Yan, An
title Analysis of NSIDC Dataset Downloads and Metadata
publisher Center for Open Science
publishDate 2017
url http://dx.doi.org/10.31219/osf.io/5mh9n
genre National Snow and Ice Data Center
op_rights https://creativecommons.org/licenses/by/4.0/legalcode
op_rightsnorm CC-BY
op_doi https://doi.org/10.31219/osf.io/5mh9n
_version_ 1766071692401049600