Improving Deep Metric Learning by Divide and Conquer

Deep metric learning (DML) is a cornerstone of many computer vision applications. It aims to learn a mapping from the input domain to an embedding space in which semantically similar objects are located nearby and dissimilar objects far from one another. The target similarity on the training data is de...

Bibliographic Details
Main Authors:	Sanakoyeu, Artsiom, Ma, Pingchuan, Tschernezki, Vadim, Ommer, Björn
Format:	Text
Language:	unknown
Published:	2021
Subjects:	Computer Science - Computer Vision and Pattern Recognition DML
Online Access:	http://arxiv.org/abs/2109.04003

id	ftarxivpreprints:oai:arXiv.org:2109.04003
record_format	openpolar
spelling	ftarxivpreprints:oai:arXiv.org:2109.04003 2023-09-05T13:19:06+02:00 Improving Deep Metric Learning by Divide and Conquer Sanakoyeu, Artsiom Ma, Pingchuan Tschernezki, Vadim Ommer, Björn 2021-09-08 http://arxiv.org/abs/2109.04003 unknown http://arxiv.org/abs/2109.04003 Computer Science - Computer Vision and Pattern Recognition text 2021 ftarxivpreprints 2023-08-16T16:40:20Z Text DML ArXiv.org (Cornell University Library)
institution	Open Polar
collection	ArXiv.org (Cornell University Library)
op_collection_id	ftarxivpreprints
language	unknown
topic	Computer Science - Computer Vision and Pattern Recognition
spellingShingle	Computer Science - Computer Vision and Pattern Recognition Sanakoyeu, Artsiom Ma, Pingchuan Tschernezki, Vadim Ommer, Björn Improving Deep Metric Learning by Divide and Conquer
topic_facet	Computer Science - Computer Vision and Pattern Recognition
description	Deep metric learning (DML) is a cornerstone of many computer vision applications. It aims to learn a mapping from the input domain to an embedding space in which semantically similar objects are located nearby and dissimilar objects far from one another. The target similarity on the training data is defined by the user in the form of ground-truth class labels. However, while the embedding space learns to mimic the user-provided similarity on the training data, it should also generalize to novel categories not seen during training. Besides the user-provided ground-truth training labels, many additional visual factors (such as viewpoint changes or shape peculiarities) exist and imply different notions of similarity between objects, which affects generalization to images unseen during training. However, existing approaches usually learn a single embedding space directly on all available training data, struggle to encode all the different types of relationships, and do not generalize well. We propose to build a more expressive representation by jointly splitting the embedding space and the data hierarchically into smaller sub-parts. We successively focus on smaller subsets of the training data, reducing its variance and learning a different embedding subspace for each data subset. Moreover, the subspaces are learned jointly to cover not only the intricacies but also the breadth of the data. Only after that do we build the final embedding from the subspaces in the conquering stage. The proposed algorithm acts as a transparent wrapper that can be placed around arbitrary existing DML methods. Our approach significantly improves upon the state of the art on image retrieval, clustering, and re-identification tasks evaluated on the CUB200-2011, CARS196, Stanford Online Products, In-shop Clothes, and PKU VehicleID datasets. Comment: Accepted to PAMI. Source code: https://github.com/CompVis/metric-learning-divide-and-conquer-improved
format	Text
author	Sanakoyeu, Artsiom Ma, Pingchuan Tschernezki, Vadim Ommer, Björn
author_facet	Sanakoyeu, Artsiom Ma, Pingchuan Tschernezki, Vadim Ommer, Björn
author_sort	Sanakoyeu, Artsiom
title	Improving Deep Metric Learning by Divide and Conquer
title_sort	improving deep metric learning by divide and conquer
publishDate	2021
url	http://arxiv.org/abs/2109.04003
genre	DML
genre_facet	DML
op_relation	http://arxiv.org/abs/2109.04003
_version_	1776199914612588544
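
Illustrative sketch (not part of the catalog record, and not the authors' implementation): the description field above outlines a divide step that splits both the data and the embedding space, followed by a conquer step that rebuilds the full embedding. The toy Python below shows one way a single level of that split could look. The use of k-means for grouping the data, the contiguous slicing of embedding dimensions, and every function name here (kmeans, split_dims, divide) are assumptions made for illustration; the paper applies the split hierarchically and trains each subspace with a DML loss, which this sketch omits.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row of x."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every point to every center.
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = x[assign == j].mean(axis=0)
    return assign

def split_dims(dim, k):
    """Split embedding dimensions 0..dim-1 into k contiguous slices."""
    bounds = np.linspace(0, dim, k + 1).astype(int)
    return [slice(bounds[i], bounds[i + 1]) for i in range(k)]

def divide(embeddings, k):
    """Divide step (assumed form): pair each data cluster with one subspace."""
    assign = kmeans(embeddings, k)
    slices = split_dims(embeddings.shape[1], k)
    return [(np.flatnonzero(assign == j), slices[j]) for j in range(k)]

rng = np.random.default_rng(0)
emb = rng.normal(size=(12, 8))  # toy stand-in for learned embeddings
parts = divide(emb, k=2)
for idx, sl in parts:
    # Block of the embedding matrix a per-subset loss would be applied to.
    sub = emb[idx, sl]
    print(len(idx), sub.shape)
```

In this sketch the conquer step needs no extra code: because the slices tile all embedding dimensions, each full row of emb is already the concatenation of its subspace parts.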

Similar Items