Self-Supervised Learning from Semantically Imprecise Data

Learning from imprecise labels such as "animal" or "bird", but making precise predictions like "snow bunting" at inference time is an important capability for any classifier when expertly labeled training data is scarce. Contributions by volunteers or results of web cra...

Bibliographic Details
Main Authors: Brust, Clemens-Alexander, Barz, Björn, Denzler, Joachim
Format: Text
Language: unknown
Published: 2021
Subjects: Computer Science - Computer Vision and Pattern Recognition
Online Access: http://arxiv.org/abs/2104.10901
id ftarxivpreprints:oai:arXiv.org:2104.10901
record_format openpolar
spelling ftarxivpreprints:oai:arXiv.org:2104.10901 2023-09-05T13:23:08+02:00 Self-Supervised Learning from Semantically Imprecise Data Brust, Clemens-Alexander Barz, Björn Denzler, Joachim 2021-04-22 http://arxiv.org/abs/2104.10901 unknown http://arxiv.org/abs/2104.10901 Computer Science - Computer Vision and Pattern Recognition text 2021 ftarxivpreprints 2023-08-16T16:27:25Z Text Snow Bunting ArXiv.org (Cornell University Library)
institution Open Polar
collection ArXiv.org (Cornell University Library)
op_collection_id ftarxivpreprints
language unknown
topic Computer Science - Computer Vision and Pattern Recognition
description Learning from imprecise labels such as "animal" or "bird", but making precise predictions like "snow bunting" at inference time is an important capability for any classifier when expertly labeled training data is scarce. Contributions by volunteers or results of web crawling lack precision in this manner, but are still valuable. Crucially, these weakly labeled examples are available in larger quantities and at lower cost than high-quality bespoke training data. CHILLAX, a recently proposed method to tackle this task, leverages a hierarchical classifier to learn from imprecise labels. However, it has two major limitations. First, it does not learn from examples labeled as the root of the hierarchy, e.g., "object". Second, it extrapolates annotations to precise labels only at test time, although confident extrapolations could already be used as training data. In this work, we extend CHILLAX with a self-supervised scheme using constrained semantic extrapolation to generate pseudo-labels. This addresses the second limitation, which in turn solves the first, enabling an even weaker supervision requirement than CHILLAX. We evaluate our approach empirically, showing that our method yields a consistent accuracy improvement of 0.84 to 1.19 percentage points over CHILLAX and is suitable as a drop-in replacement without negative consequences such as longer training times. Comment: 9 pages. Accepted for publication at VISAPP 2022
format Text
author Brust, Clemens-Alexander
Barz, Björn
Denzler, Joachim
author_sort Brust, Clemens-Alexander
title Self-Supervised Learning from Semantically Imprecise Data
title_sort self-supervised learning from semantically imprecise data
publishDate 2021
url http://arxiv.org/abs/2104.10901
genre Snow Bunting
op_relation http://arxiv.org/abs/2104.10901
_version_ 1776203717051154432