An open source framework for metadata exploration and discovery of Polar Data

This project will deliver an open source framework for metadata exploration, automatic text mining and information retrieval of polar data that uses the Apache Tika technology. Apache Tika is currently the de facto "babel fish", aiding in the automatic MIME detection, text extraction, and...


Bibliographic Details
Main Author: Christian Mattmann
Format: Dataset
Language: unknown
Published: Arctic Data Center 2014
Subjects:
PCI
Online Access: https://search.dataone.org/view/urn:uuid:37b2ab59-ec4a-4330-9e45-5d918dd8a50e
id dataone:urn:uuid:37b2ab59-ec4a-4330-9e45-5d918dd8a50e
record_format openpolar
spelling dataone:urn:uuid:37b2ab59-ec4a-4330-9e45-5d918dd8a50e 2023-11-08T14:14:13+01:00 An open source framework for metadata exploration and discovery of Polar Data Christian Mattmann Global ENVELOPE(-180.0,180.0,90.0,-90.0) BEGINDATE: 2015-01-01T00:00:00Z ENDDATE: 2016-01-01T00:00:00Z 2014-05-06T00:00:00Z https://search.dataone.org/view/urn:uuid:37b2ab59-ec4a-4330-9e45-5d918dd8a50e unknown Arctic Data Center PCI Dataset 2014 dataone:urn:node:ARCTIC 2023-11-08T13:40:37Z Dataset Antarc* Antarctic Arctic Arctic Data Center (via DataONE) Antarctic Arctic Babel ENVELOPE(-61.401,-61.401,-63.885,-63.885) The Antarctic Tika ENVELOPE(7.590,7.590,63.223,63.223)
institution Open Polar
collection Arctic Data Center (via DataONE)
op_collection_id dataone:urn:node:ARCTIC
language unknown
topic PCI
spellingShingle PCI
Christian Mattmann
An open source framework for metadata exploration and discovery of Polar Data
topic_facet PCI
description This project will deliver an open source framework for metadata exploration, automatic text mining, and information retrieval of polar data that uses the Apache Tika technology. Apache Tika is currently the de facto "babel fish", aiding in the automatic MIME detection, text extraction, and metadata classification of over 1200 data formats. The PI will expand Tika to handle polar data and scientific data formats, making polar data more easily available, searchable, and retrievable by all major content management systems. The proposed activity will lay the framework for a thorough, automatically generated inventory of polar metadata and data. Expanding Tika to handle polar data will also naturally invite the technology/open source community to deal with polar use cases, helping to increase understanding of the Arctic. The software produced through this effort will be disseminated to the software and polar communities through the Apache Software Foundation. A computer science graduate student and a postdoc will be exposed to cryosphere and Arctic data, helping to train the next generation of cross-disciplinary data scientists in the domain. The PI's Search Engines (annual enrollment of 20-40 students) and Software Architecture (annual enrollment of 30-50 students) graduate courses at USC will benefit from the Arctic cyberinfrastructure use cases disseminated through course projects and lecture material. The PI will also work collaboratively with NSF-funded projects focused on the archiving, discovery, and access of polar data, such as ACADIS and the Antarctic Master Directory.
format Dataset
author Christian Mattmann
author_facet Christian Mattmann
author_sort Christian Mattmann
title An open source framework for metadata exploration and discovery of Polar Data
title_short An open source framework for metadata exploration and discovery of Polar Data
title_full An open source framework for metadata exploration and discovery of Polar Data
title_fullStr An open source framework for metadata exploration and discovery of Polar Data
title_full_unstemmed An open source framework for metadata exploration and discovery of Polar Data
title_sort open source framework for metadata exploration and discovery of polar data
publisher Arctic Data Center
publishDate 2014
url https://search.dataone.org/view/urn:uuid:37b2ab59-ec4a-4330-9e45-5d918dd8a50e
op_coverage Global
ENVELOPE(-180.0,180.0,90.0,-90.0)
BEGINDATE: 2015-01-01T00:00:00Z ENDDATE: 2016-01-01T00:00:00Z
long_lat ENVELOPE(-61.401,-61.401,-63.885,-63.885)
ENVELOPE(7.590,7.590,63.223,63.223)
geographic Antarctic
Arctic
Babel
The Antarctic
Tika
geographic_facet Antarctic
Arctic
Babel
The Antarctic
Tika
genre Antarc*
Antarctic
Arctic
genre_facet Antarc*
Antarctic
Arctic
_version_ 1782011984480305152