Deep Metric Learning with Soft Orthogonal Proxies
Deep Metric Learning (DML) models rely on strong representations and similarity-based measures with specific loss functions. Proxy-based losses have shown great performance compared to pair-based losses in terms of convergence speed. However, proxies that are assigned to different classes may end up...
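Main Authors: | Saberi-Movahed, Farshad; Ebrahimpour, Mohammad K.; Saberi-Movahed, Farid; Moshavash, Monireh; Rahmatian, Dorsa; Mohazzebi, Mahvash; Shariatzadeh, Mahdi; Eftekhari, Mahdi |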
---|---|
Format: | Text |
Language: | unknown |
Published: | 2023 |
Subjects: | Computer Science - Computer Vision and Pattern Recognition |
Online Access: | http://arxiv.org/abs/2306.13055 |
id |
ftarxivpreprints:oai:arXiv.org:2306.13055 |
---|---|
record_format |
openpolar |
spelling |
ftarxivpreprints:oai:arXiv.org:2306.13055 2023-09-05T13:19:06+02:00 Deep Metric Learning with Soft Orthogonal Proxies Saberi-Movahed, Farshad Ebrahimpour, Mohammad K. Saberi-Movahed, Farid Moshavash, Monireh Rahmatian, Dorsa Mohazzebi, Mahvash Shariatzadeh, Mahdi Eftekhari, Mahdi 2023-06-22 http://arxiv.org/abs/2306.13055 unknown http://arxiv.org/abs/2306.13055 Computer Science - Computer Vision and Pattern Recognition text 2023 ftarxivpreprints 2023-08-16T17:47:07Z Text DML ArXiv.org (Cornell University Library) |
institution |
Open Polar |
collection |
ArXiv.org (Cornell University Library) |
op_collection_id |
ftarxivpreprints |
language |
unknown |
topic |
Computer Science - Computer Vision and Pattern Recognition |
description |
Deep Metric Learning (DML) models rely on strong representations and similarity-based measures with specific loss functions. Proxy-based losses have shown great performance compared to pair-based losses in terms of convergence speed. However, proxies that are assigned to different classes may end up closely located in the embedding space, making it hard to distinguish between positive and negative items. Alternatively, they may become highly correlated and hence provide redundant information to the model. To address these issues, we propose a novel approach that introduces a Soft Orthogonality (SO) constraint on proxies. The constraint ensures that the proxies are as orthogonal as possible and hence controls their positions in the embedding space. Our approach leverages the Data-Efficient Image Transformer (DeiT) as an encoder to extract contextual features from images, along with a DML objective. The objective consists of the Proxy Anchor loss together with the SO regularization. We evaluate our method on four public benchmarks for category-level image retrieval and demonstrate its effectiveness with comprehensive experimental results and ablation studies. Our evaluations demonstrate the superiority of our proposed approach over state-of-the-art methods by a significant margin. |
format |
Text |
author |
Saberi-Movahed, Farshad; Ebrahimpour, Mohammad K.; Saberi-Movahed, Farid; Moshavash, Monireh; Rahmatian, Dorsa; Mohazzebi, Mahvash; Shariatzadeh, Mahdi; Eftekhari, Mahdi |
author_sort |
Saberi-Movahed, Farshad |
title |
Deep Metric Learning with Soft Orthogonal Proxies |
publishDate |
2023 |
url |
http://arxiv.org/abs/2306.13055 |
genre |
DML |
op_relation |
http://arxiv.org/abs/2306.13055 |
_version_ |
1776199908615782400 |
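
The description above states that the proxies are regularized to be "as orthogonal as possible". As a rough illustration only, the sketch below computes one common form of soft orthogonality penalty, the squared Frobenius norm of P Pᵀ − I over L2-normalized proxies P. This exact form is an assumption: the record does not give the paper's formula, and the function names `l2_normalize` and `soft_orthogonality_penalty` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def soft_orthogonality_penalty(proxies):
    """Return ||P P^T - I||_F^2 for row-normalized proxies P.

    Assumed form of the penalty (not taken from the record): it is zero
    when all proxies are mutually orthogonal and grows as they become
    more correlated.
    """
    unit = [l2_normalize(p) for p in proxies]
    penalty = 0.0
    for i, p_i in enumerate(unit):
        for j, p_j in enumerate(unit):
            gram_ij = sum(a * b for a, b in zip(p_i, p_j))  # entry (i, j) of P P^T
            target = 1.0 if i == j else 0.0                 # entry (i, j) of I
            penalty += (gram_ij - target) ** 2
    return penalty


# One proxy per class; each row is a proxy vector in a 3-dimensional embedding space.
orthogonal = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
correlated = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]

print(soft_orthogonality_penalty(orthogonal))
print(soft_orthogonality_penalty(correlated))
```

In a training objective of the kind the description outlines, such a penalty would typically be added to the Proxy Anchor loss with a weighting coefficient; the value of that weight is not given in this record.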