Analysis of NSIDC Dataset Downloads and Metadata
Few research studies have quantitatively analyzed metadata elements associated with scientific data reuse. By using metadata and dataset download rates from the National Snow and Ice Data Center, we address whether there are key indicators in data repository metadata that show a statistically signif...
Main Authors: | Kolesnikova, Yulia; Lathrop, Adam; Norlander, Bree; Yan, An |
---|---|
Format: | Other/Unknown Material |
Language: | unknown |
Published: | Center for Open Science, 2017 |
Subjects: | |
Online Access: | https://doi.org/10.31219/osf.io/5mh9n |
_version_ | 1829934351313272832 |
---|---|
author | Kolesnikova, Yulia; Lathrop, Adam; Norlander, Bree; Yan, An |
author_facet | Kolesnikova, Yulia; Lathrop, Adam; Norlander, Bree; Yan, An |
author_sort | Kolesnikova, Yulia |
collection | COS Center for Open Science |
description | Few research studies have quantitatively analyzed metadata elements associated with scientific data reuse. By using metadata and dataset download rates from the National Snow and Ice Data Center, we address whether there are key indicators in data repository metadata that show a statistically significant correlation with the download count of a dataset and whether we can predict data reuse using machine learning techniques. We used the download rate by unique IP addresses for individual datasets as our dependent variable and as a proxy for data reuse. Our analysis shows that the following metadata elements in NSIDC datasets are positively correlated with download rates: year of citation, number of data formats, number of contributors, number of platforms, number of spatial coverage areas, number of locations, and number of keywords. Our results are applicable to researchers and professionals working with data and add to the small body of work addressing metadata best practices for increasing discovery of data. |
format | Other/Unknown Material |
genre | National Snow and Ice Data Center |
genre_facet | National Snow and Ice Data Center |
id | crcenteros:10.31219/osf.io/5mh9n |
institution | Open Polar |
language | unknown |
op_collection_id | crcenteros |
op_doi | https://doi.org/10.31219/osf.io/5mh9n |
op_rights | https://creativecommons.org/licenses/by/4.0/legalcode |
publishDate | 2017 |
publisher | Center for Open Science |
record_format | openpolar |
title | Analysis of NSIDC Dataset Downloads and Metadata |
title_full | Analysis of NSIDC Dataset Downloads and Metadata |
title_fullStr | Analysis of NSIDC Dataset Downloads and Metadata |
title_full_unstemmed | Analysis of NSIDC Dataset Downloads and Metadata |
title_short | Analysis of NSIDC Dataset Downloads and Metadata |
title_sort | analysis of nsidc dataset downloads and metadata |
url | https://doi.org/10.31219/osf.io/5mh9n |