Online Multimodal Deep Similarity Learning with Application to Image Retrieval

Recent years have witnessed extensive studies on distance metric learning (DML) for improving similarity search in multimedia information retrieval tasks. Despite their successes, most existing DML methods suffer from two critical limitations: (i) they typically attempt to learn a linear distance fu...

Bibliographic Details
Main Authors:	Pengcheng Wu, Steven C. H. Hoi, Hao Xia, Peilin Zhao, Dayong Wang, Chunyan Miao
Other Authors:	The Pennsylvania State University CiteSeerX Archives
Format:	Text
Language:	English
Subjects:	General Terms: Algorithms, Experimentation; Keywords: deep learning
Online Access:	http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.432.8126 http://www.cais.ntu.edu.sg/~chhoi/paper_pdf/p153-wu.pdf

id	ftciteseerx:oai:CiteSeerX.psu:10.1.1.432.8126
record_format	openpolar
institution	Open Polar
collection	Unknown
op_collection_id	ftciteseerx
language	English
topic	General Terms Algorithms Experimentation Keywords deep learning
description	Recent years have witnessed extensive studies on distance metric learning (DML) for improving similarity search in multimedia information retrieval tasks. Despite their successes, most existing DML methods suffer from two critical limitations: (i) they typically attempt to learn a linear distance function on the input feature space, and this linearity assumption limits their capacity to measure similarity over complex patterns in real-world applications; (ii) they are often designed for learning distance metrics on uni-modal data, and so may not effectively handle similarity measures for multimedia objects with multimodal representations. To address these limitations, we propose a novel framework of online multimodal deep similarity learning (OMDSL), which aims to optimally integrate multiple deep neural networks pretrained with stacked denoising autoencoders. In particular, the proposed framework explores a unified two-stage online learning scheme that consists of (i) learning a flexible nonlinear transformation function for each individual modality, and (ii) learning the optimal combination of multiple diverse modalities simultaneously in a coherent process. We conduct an extensive set of experiments to evaluate the proposed algorithms on multimodal image retrieval tasks, and the encouraging results validate the effectiveness of the proposed technique.
author2	The Pennsylvania State University CiteSeerX Archives
format	Text
author	Pengcheng Wu Steven C. H. Hoi Hao Xia Peilin Zhao Dayong Wang Chunyan Miao
title	Online Multimodal Deep Similarity Learning with Application to Image Retrieval
url	http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.432.8126 http://www.cais.ntu.edu.sg/~chhoi/paper_pdf/p153-wu.pdf
long_lat	ENVELOPE(161.983,161.983,-78.000,-78.000)
geographic	Handle The
genre	DML
op_source	http://www.cais.ntu.edu.sg/~chhoi/paper_pdf/p153-wu.pdf
op_relation	http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.432.8126 http://www.cais.ntu.edu.sg/~chhoi/paper_pdf/p153-wu.pdf
op_rights	Metadata may be used without restrictions as long as the oai identifier remains attached to it.
_version_	1766397442493775872

Similar Items