A unifying mutual information view of metric learning: cross-entropy vs. pairwise losses
Recently, substantial research efforts in Deep Metric Learning (DML) have focused on designing complex pairwise-distance losses, which require convoluted schemes to ease optimization, such as sample mining or pair weighting. The standard cross-entropy loss for classification has been largely overlooked in DML.
Main Authors: | Boudiaf, Malik; Rony, Jérôme; Ziko, Imtiaz Masud; Granger, Eric; Pedersoli, Marco; Piantanida, Pablo; Ayed, Ismail Ben |
---|---|
Format: | Article in Journal/Newspaper |
Language: | unknown |
Published: | arXiv, 2020 |
Subjects: | Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML); FOS: Computer and information sciences |
Online Access: | https://dx.doi.org/10.48550/arxiv.2003.08983 https://arxiv.org/abs/2003.08983 |
id |
ftdatacite:10.48550/arxiv.2003.08983 |
record_format |
openpolar |
institution |
Open Polar |
collection |
DataCite Metadata Store (German National Library of Science and Technology) |
op_collection_id |
ftdatacite |
language |
unknown |
topic |
Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML); FOS: Computer and information sciences |
description |
Recently, substantial research efforts in Deep Metric Learning (DML) have focused on designing complex pairwise-distance losses, which require convoluted schemes to ease optimization, such as sample mining or pair weighting. The standard cross-entropy loss for classification has been largely overlooked in DML. On the surface, the cross-entropy may seem unrelated and irrelevant to metric learning as it does not explicitly involve pairwise distances. However, we provide a theoretical analysis that links the cross-entropy to several well-known and recent pairwise losses. Our connections are drawn from two different perspectives: one based on an explicit optimization insight; the other on discriminative and generative views of the mutual information between the labels and the learned features. First, we explicitly demonstrate that the cross-entropy is an upper bound on a new pairwise loss, which has a structure similar to various pairwise losses: it minimizes intra-class distances while maximizing inter-class distances. As a result, minimizing the cross-entropy can be seen as an approximate bound-optimization (or Majorize-Minimize) algorithm for minimizing this pairwise loss. Second, we show that, more generally, minimizing the cross-entropy is actually equivalent to maximizing the mutual information, to which we connect several well-known pairwise losses. Furthermore, we show that various standard pairwise losses can be explicitly related to one another via bound relationships. Our findings indicate that the cross-entropy represents a proxy for maximizing the mutual information -- as pairwise losses do -- without the need for convoluted sample-mining heuristics. Our experiments over four standard DML benchmarks strongly support our findings. We obtain state-of-the-art results, outperforming recent and complex DML methods. ECCV 2020 (Spotlight). Code available at: https://github.com/jeromerony/dml_cross_entropy |
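The abstract's central claim, that minimizing the cross-entropy is a proxy for maximizing the mutual information between learned features z and labels y, rests on the standard discriminative decomposition of mutual information. A sketch of the reasoning, in our own notation (the paper's exact bounds and constants are not reproduced here):

```latex
% Discriminative view of mutual information: H(y) is fixed by the
% dataset's label distribution, so maximizing I(z; y) amounts to
% minimizing the conditional entropy H(y | z).
I(z; y) = H(y) - H(y \mid z)

% For any classifier q(y | z), the cross-entropy upper-bounds H(y | z):
H(y \mid z) \;\le\; \mathbb{E}_{(z, y)}\bigl[-\log q(y \mid z)\bigr] \;=\; \mathcal{L}_{\mathrm{CE}}

% Hence minimizing the cross-entropy maximizes a lower bound on I(z; y):
I(z; y) \;\ge\; H(y) - \mathcal{L}_{\mathrm{CE}}
```

Likewise, the "minimizes intra-class distances while maximizing inter-class distances" structure that the abstract attributes to pairwise losses can be made concrete with a minimal contrastive-style sketch in PyTorch. This is a generic illustration, not the specific pairwise loss derived in the paper; the function name, margin parameter, and hinge form are our assumptions:

```python
import torch

def generic_pairwise_loss(features: torch.Tensor, labels: torch.Tensor,
                          margin: float = 1.0) -> torch.Tensor:
    """Illustrative pairwise-distance loss: pulls same-class embeddings
    together and pushes different-class embeddings at least `margin`
    apart. NOT the paper's derived loss -- a generic sketch only."""
    dists = torch.cdist(features, features)              # (n, n) Euclidean distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)    # (n, n) same-class mask
    off_diag = ~torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    intra = dists[same & off_diag].mean()                # minimize intra-class distances
    inter = torch.clamp(margin - dists[~same], min=0).mean()  # hinge: push classes apart
    return intra + inter
```

On a batch of embeddings and labels, `generic_pairwise_loss(z, y)` exposes the pull/push trade-off that, per the paper's analysis, the cross-entropy already bounds implicitly.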
format |
Article in Journal/Newspaper |
author |
Boudiaf, Malik; Rony, Jérôme; Ziko, Imtiaz Masud; Granger, Eric; Pedersoli, Marco; Piantanida, Pablo; Ayed, Ismail Ben |
author_sort |
Boudiaf, Malik |
title |
A unifying mutual information view of metric learning: cross-entropy vs. pairwise losses |
publisher |
arXiv |
publishDate |
2020 |
url |
https://dx.doi.org/10.48550/arxiv.2003.08983 https://arxiv.org/abs/2003.08983 |
genre |
DML |
op_rights |
arXiv.org perpetual, non-exclusive license http://arxiv.org/licenses/nonexclusive-distrib/1.0/ |
op_doi |
https://doi.org/10.48550/arxiv.2003.08983 |
_version_ |
1766397333038170112 |