The ExtremeEarth project: A Solution for Generating Reliable Semantically Annotated Data

Bibliographic Details
Main Authors: Dumitru, Corneliu Octavian, Schwarz, Gottfried, Yao, Wei, Datcu, Mihai
Format: Conference Object
Language: unknown
Published: 2022
Subjects:
Online Access: https://elib.dlr.de/186539/
https://lps22.esa.int/frontend/index.php
Description
Summary: ExtremeEarth is a European H2020 project that aims to develop analytics techniques and technologies combining Copernicus satellite data with information and knowledge extraction, and to exploit them on ESA’s Food Security and Polar Thematic Exploitation Platforms. The current publication focuses on the Polar case, for which a large training dataset has been generated, and demonstrates the use of different machine learning/deep learning techniques (e.g., compression-based pattern recognition, cascaded learning for semantic labelling, explainable AI for SAR sea-ice content discovery, and a physics-aware deep hybrid architecture). The solution proposed in the project is an active learning approach that offers a simple way to generate semantically annotated datasets from given Sentinel-1/Sentinel-2 images. Active learning is a form of supervised machine learning in which the learning algorithm can interactively query a (human) information source to obtain the desired image classification outputs at new data points. The key idea is that a machine learning algorithm (e.g., a Support Vector Machine) can achieve greater accuracy with fewer training labels if it is allowed to choose the data from which it learns. Our approach also includes relevance feedback, which helps users search for additional images of interest in a large repository. Further, expert users can define new image content classes that do not exist yet, based on their specific knowledge; however, different users can give different meanings to the new classes. Consequently, the number of semantic classes that can be extracted with the proposed active learning method is not fixed (as it is for many current state-of-the-art classification methods); instead, the classes are defined interactively by the users. We apply our active learning scheme to sequences of pre-selected satellite images.
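The query loop described above can be illustrated with a minimal sketch. It is not the project's implementation: it assumes a linear SVM trained by sub-gradient descent, uncertainty sampling as the query strategy (the summary does not name one), two fixed classes, and toy 2-D points standing in for image patch features. The names `train_linear_svm`, `decision`, `active_learning`, `make_toy_pool`, and `ask_label` are hypothetical, and the human annotator is simulated by a lookup of known labels.

```python
import random

def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.1):
    """Fit a linear SVM (hinge loss + L2 penalty) by sub-gradient descent.
    Labels must be -1 or +1. Returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]   # L2 shrinkage
            if margin < 1.0:                          # hinge loss is active
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def decision(model, x):
    """Signed score; its magnitude grows with distance from the boundary."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def active_learning(pool, ask_label, seed_idx, n_queries):
    """Pool-based active learning with uncertainty sampling (an assumption).
    ask_label(i) stands in for the human annotator labelling pool item i."""
    labeled = {i: ask_label(i) for i in seed_idx}
    for _ in range(n_queries):
        idx = sorted(labeled)
        model = train_linear_svm([pool[i] for i in idx], [labeled[i] for i in idx])
        candidates = [i for i in range(len(pool)) if i not in labeled]
        if not candidates:
            break
        # Query the sample the current classifier is least certain about.
        query = min(candidates, key=lambda i: abs(decision(model, pool[i])))
        labeled[query] = ask_label(query)
    idx = sorted(labeled)
    model = train_linear_svm([pool[i] for i in idx], [labeled[i] for i in idx])
    return model, labeled

def make_toy_pool(n_per_class=100, seed=0):
    """Two synthetic 2-D clusters standing in for image patch features."""
    rng = random.Random(seed)
    pool, truth = [], []
    for label, centre in ((-1, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            pool.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)])
            truth.append(label)
    return pool, truth

pool, truth = make_toy_pool()
# Start from one labelled example per class, then let the learner ask 10 times.
model, labeled = active_learning(pool, lambda i: truth[i], seed_idx=[0, 100], n_queries=10)
accuracy = sum((decision(model, x) > 0) == (t > 0) for x, t in zip(pool, truth)) / len(pool)
print(f"labels requested: {len(labeled)}, pool accuracy: {accuracy:.2f}")
```

In this sketch the learner requests only 12 of the 200 available labels, which mirrors the stated goal of reaching good accuracy with few annotations. Supporting user-defined classes, as the summary describes, would require a multi-class extension that is not shown here.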
The idea behind this active learning approach is to obtain high classification accuracies (between 85% and ...