Harmonizing heterogeneous multi-proxy data from lake systems

When performing spatial-temporal investigations of multiple lake systems, geoscientists face the challenge of dealing with complex and heterogeneous data of different types, structure, and format. To support comparability, it is necessary to transform such data into a uniform format that ensures syn...

Bibliographic Details
Published in: Computers & Geosciences
Main Authors: Pfalz, Gregor, Diekmann, Bernhard, Freytag, Johann-Christoph, Biskaborn, Boris K.
Format: Article in Journal/Newspaper
Language: unknown
Published: 2021
Online Access: https://epic.awi.de/id/eprint/55325/
https://epic.awi.de/id/eprint/55325/1/Pfalz_et_al_2021.pdf
https://doi.org/10.1016/j.cageo.2021.104791
https://hdl.handle.net/10013/epic.8de95a70-abeb-4bb3-a7f5-ee5e15f39dea
id ftawi:oai:epic.awi.de:55325
record_format openpolar
institution Open Polar
collection Alfred Wegener Institute for Polar- and Marine Research (AWI): ePIC (electronic Publication Information Center)
op_collection_id ftawi
language unknown
description When performing spatial-temporal investigations of multiple lake systems, geoscientists face the challenge of dealing with complex and heterogeneous data of different types, structures, and formats. To support comparability, it is necessary to transform such data into a uniform format that ensures syntactic and semantic comparability. This paper presents a data science approach for transforming research data from different lake sediment cores into a coherent framework. For this purpose, we collected published and unpublished data from paleolimnological investigations of Arctic lake systems. Our approach adapted methods from the database field, such as developing entity-relationship (ER) diagrams, to understand the conceptual structure of the data independently of the source. We demonstrated the feasibility of our approach by transforming our ER diagram into a database schema for PostgreSQL, a popular database management system (DBMS). We validated our approach by conducting a comparative analysis on a set of acquired data, focusing on the comparison of total organic carbon and bromine content in eight selected sediment cores. Still, we encountered serious obstacles in the development of the ER model. Heterogeneous structures within the collected data made automatic data integration impossible. Additionally, we realized that missing error information hampers the development of a conceptual model. Despite the strong initial heterogeneity of the original data, our harmonization yields comparable datasets, enabling numerical inter-proxy and inter-lake comparison.
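The ER-diagram-to-schema step described above can be illustrated with a minimal sketch. It is not the schema from the paper: the table names (`lake`, `core`, `measurement`), the column names, and all sample values are hypothetical, and Python's built-in `sqlite3` module stands in for PostgreSQL so that the example runs without a server. It only shows the general idea of storing every proxy in one long-format table so that proxies and cores can be compared with a single query.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only and are
# NOT taken from the paper's ER diagram. SQLite stands in for PostgreSQL.
SCHEMA = """
CREATE TABLE lake (
    lake_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE core (
    core_id INTEGER PRIMARY KEY,
    lake_id INTEGER NOT NULL REFERENCES lake(lake_id),
    label   TEXT NOT NULL UNIQUE
);
CREATE TABLE measurement (
    core_id  INTEGER NOT NULL REFERENCES core(core_id),
    depth_cm REAL NOT NULL,
    proxy    TEXT NOT NULL,   -- e.g. 'TOC' or 'Br'
    value    REAL NOT NULL,
    unit     TEXT NOT NULL,
    PRIMARY KEY (core_id, depth_cm, proxy)
);
"""


def build_demo_db():
    """Create an in-memory database and load a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO lake VALUES (?, ?)",
                     [(1, "Lake A"), (2, "Lake B")])
    conn.executemany("INSERT INTO core VALUES (?, ?, ?)",
                     [(1, 1, "A-1"), (2, 2, "B-1")])
    # Invented placeholder values, not real measurements.
    rows = [
        (1, 10.0, "TOC", 4.0, "wt%"),
        (1, 20.0, "TOC", 6.0, "wt%"),
        (1, 10.0, "Br", 12.0, "ppm"),
        (2, 10.0, "TOC", 2.0, "wt%"),
        (2, 10.0, "Br", 8.0, "ppm"),
    ]
    conn.executemany("INSERT INTO measurement VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def mean_by_core(conn, proxy):
    """Return (core label, mean value) pairs for one proxy, sorted by label."""
    query = """
        SELECT c.label, AVG(m.value)
        FROM measurement AS m
        JOIN core AS c ON c.core_id = m.core_id
        WHERE m.proxy = ?
        GROUP BY c.label
        ORDER BY c.label
    """
    return conn.execute(query, (proxy,)).fetchall()
```

Because each measurement row carries its own proxy name and unit, a new proxy can be added without altering the table layout, which is one plausible way to obtain the inter-proxy and inter-lake comparability the abstract refers to.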
format Article in Journal/Newspaper
author Pfalz, Gregor
Diekmann, Bernhard
Freytag, Johann-Christoph
Biskaborn, Boris K.
title Harmonizing heterogeneous multi-proxy data from lake systems
publishDate 2021
url https://epic.awi.de/id/eprint/55325/
https://epic.awi.de/id/eprint/55325/1/Pfalz_et_al_2021.pdf
https://doi.org/10.1016/j.cageo.2021.104791
https://hdl.handle.net/10013/epic.8de95a70-abeb-4bb3-a7f5-ee5e15f39dea
long_lat ENVELOPE(-130.826,-130.826,57.231,57.231)
geographic Arctic
Arctic Lake
genre Arctic
op_source EPIC3Computers & Geosciences, 153, pp. 104791, ISSN: 00983004
op_relation https://epic.awi.de/id/eprint/55325/1/Pfalz_et_al_2021.pdf
Pfalz, G. orcid:0000-0003-1218-177X, Diekmann, B. orcid:0000-0001-5129-3649, Freytag, J. C. and Biskaborn, B. K. orcid:0000-0003-2378-0348 (2021) Harmonizing heterogeneous multi-proxy data from lake systems, Computers & Geosciences, 153, p. 104791. doi:10.1016/j.cageo.2021.104791 <https://doi.org/10.1016/j.cageo.2021.104791>, hdl:10013/epic.8de95a70-abeb-4bb3-a7f5-ee5e15f39dea
op_doi https://doi.org/10.1016/j.cageo.2021.104791
container_title Computers & Geosciences
container_volume 153
container_start_page 104791