Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning ...

Deep Metric Learning (DML) has long attracted the attention of the machine learning community as a key objective. Existing solutions concentrate on fine-tuning pre-trained models on conventional image datasets. Given the success of recent pre-trained models trained on larger-scale dat...

Bibliographic Details
Main Authors:	Ren, Li; Chen, Chen; Wang, Liqiang; Hua, Kien
Format:	Article in Journal/Newspaper
Language:	unknown
Published:	arXiv 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); FOS: Computer and information sciences; DML
Online Access:	https://dx.doi.org/10.48550/arxiv.2402.02340 https://arxiv.org/abs/2402.02340

id	ftdatacite:10.48550/arxiv.2402.02340
record_format	openpolar
institution	Open Polar
collection	DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id	ftdatacite
language	unknown
topic	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); FOS: Computer and information sciences
description	Deep Metric Learning (DML) has long attracted the attention of the machine learning community as a key objective. Existing solutions concentrate on fine-tuning pre-trained models on conventional image datasets. Given the success of recent pre-trained models trained on larger-scale datasets, it is challenging to adapt such a model to DML tasks in the local data domain while retaining the previously gained knowledge. In this paper, we investigate parameter-efficient methods for fine-tuning the pre-trained model for DML tasks. In particular, we propose a novel and effective framework based on learning Visual Prompts (VPT) in pre-trained Vision Transformers (ViT). Building on the conventional proxy-based DML paradigm, we augment the proxy by incorporating semantic information from the input image and the ViT, optimizing the visual prompts for each class. We demonstrate that our new approximations with semantic information are superior to representative capabilities, thereby ... : Published in ICLR 2024 ...
format	Article in Journal/Newspaper
author	Ren, Li; Chen, Chen; Wang, Liqiang; Hua, Kien
title	Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning ...
publisher	arXiv
publishDate	2024
url	https://dx.doi.org/10.48550/arxiv.2402.02340 https://arxiv.org/abs/2402.02340
genre	DML
op_rights	arXiv.org perpetual, non-exclusive license http://arxiv.org/licenses/nonexclusive-distrib/1.0/
op_doi	https://doi.org/10.48550/arxiv.2402.02340

Similar Items