Data Driven Documents (D3) polar data visualizations for the Text REtrieval Conference Dynamic Domain (TREC DD) polar dataset, 2016

Bibliographic Details
Main Author: Chris Mattmann
Format: Dataset
Language: unknown
Published: Arctic Data Center 2017
Subjects:
Online Access: https://doi.org/10.18739/A2445HC95
Description
Summary: The National Institute of Standards and Technology (NIST) Text Retrieval Conference (TREC) Polar Dynamic Domain dataset (TREC-Polar-DD dataset) was collected over the past few years across various computer science courses (CSCI 572) at the University of Southern California (USC), in collaboration with the NSF Polar CyberInfrastructure program and the DARPA Memex program and its TREC Dynamic Domain track. TREC-Polar-DD is a diverse dataset that was collected with the Apache Tika, Nutch, and Solr software systems. It includes over 158 GB of data comprising 1.7 million web pages obtained during 17 distinct web crawls of the National Science Foundation Advanced Cooperative Arctic Data and Information System (ACADIS), the National Snow and Ice Data Center (NSIDC) Arctic Data Explorer (ADE), and the National Aeronautics and Space Administration Antarctic Master Directory (AMD) data archives. This web crawl data (http://doi.org/doi:10.18739/A2280J), along with HTML and JavaScript/D3 source code, was used to produce the data visualizations found in polar.usc.edu-gh-pages.zip.