Distance metric learning for multi-modal image retrieval and annotation

With the rapid growth of digital cameras and photo sharing websites, content-based image retrieval (CBIR) and search-based image annotation are important techniques for many real-world multimedia applications. They remain open challenges today, despite being studied extensively for a few decades in...

Bibliographic Details
Main Author: Wu, Pengcheng
Other Authors: Hoi Chu Hong, School of Computer Engineering, Centre for Computational Intelligence
Format: Thesis
Language: English
Published: 2014
Subjects:
DML
Online Access: http://hdl.handle.net/10356/60499
id ftnanyangtu:oai:dr.ntu.edu.sg:10356/60499
record_format openpolar
spelling ftnanyangtu:oai:dr.ntu.edu.sg:10356/60499 2023-05-15T16:02:08+02:00 Distance metric learning for multi-modal image retrieval and annotation Wu, Pengcheng Hoi Chu Hong School of Computer Engineering Centre for Computational Intelligence 2014 163 p. application/pdf http://hdl.handle.net/10356/60499 en eng http://hdl.handle.net/10356/60499 DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Thesis 2014 ftnanyangtu 2023-03-10T01:21:44Z Thesis DML DR-NTU (Digital Repository at Nanyang Technological University, Singapore)
institution Open Polar
collection DR-NTU (Digital Repository at Nanyang Technological University, Singapore)
op_collection_id ftnanyangtu
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
description With the rapid growth of digital cameras and photo sharing websites, content-based image retrieval (CBIR) and search-based image annotation are important techniques for many real-world multimedia applications. They remain open challenges today, despite being studied extensively for a few decades in several communities, including multimedia, signal processing, and computer vision. One key challenge of CBIR is to find an effective similarity search scheme to accurately retrieve a short list of the most similar images from a massive collection of images. Conventional CBIR approaches usually adopt rigid measures to evaluate the similarity of images, such as the classical Euclidean distance or cosine similarity, which are often limited despite being widely used in many applications. In this thesis, we investigate Distance Metric Learning (DML) techniques to improve visual similarity search in multimedia information retrieval tasks. In particular, we propose three kinds of novel machine learning algorithms to tackle the challenges of content-based image retrieval and search-based image annotation. Firstly, we present a novel Unified Distance Metric Learning (UDML) scheme for mining social images towards automated image annotation. To effectively discover knowledge from social images that are often associated with multimedia content (including visual images and textual tags), UDML not only exploits both the visual and textual content of social images, but also effectively unifies both inductive and transductive metric learning techniques in a systematic learning framework. The UDML task is formulated as a convex optimization problem, i.e., a Semi-Definite Program (SDP), which is in general difficult to solve. To overcome this challenging optimization task, we develop an efficient stochastic gradient descent algorithm and prove its convergence.
By applying the UDML technique to the search-based image annotation task on a large real-world testbed in our ...
author2 Hoi Chu Hong
School of Computer Engineering
Centre for Computational Intelligence
format Thesis
author Wu, Pengcheng
title Distance metric learning for multi-modal image retrieval and annotation
publishDate 2014
url http://hdl.handle.net/10356/60499
genre DML
op_relation http://hdl.handle.net/10356/60499
_version_ 1766397734459277312
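
The description above contrasts fixed measures such as Euclidean distance with a learned metric, and mentions solving a semi-definite program by stochastic gradient descent. The following is a minimal Python sketch of that general idea only. It assumes the common Mahalanobis form d_M(x, y) = sqrt((x - y)^T M (x - y)) with M positive semi-definite, and a simple pairwise loss chosen for illustration. The record does not state the UDML objective, so the function names, the loss, and the parameters `lr` and `margin` are hypothetical and are not the author's method.

```python
import numpy as np


def mahalanobis_distance(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a symmetric PSD matrix M.

    With M equal to the identity this reduces to the Euclidean distance.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(max(diff @ M @ diff, 0.0)))


def project_to_psd(M):
    """Project a square matrix onto the PSD cone by clipping negative eigenvalues."""
    sym = (M + M.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sym)
    # Scale each eigenvector column by its clipped eigenvalue, then recombine.
    return (eigvecs * np.clip(eigvals, 0.0, None)) @ eigvecs.T


def sgd_step(M, x_i, x_j, similar, lr=0.01, margin=1.0):
    """One stochastic update on a single labeled pair (illustrative loss only).

    Similar pairs are pulled together (loss = d^2); dissimilar pairs closer
    than `margin` are pushed apart (loss = margin - d^2). The result is
    projected back onto the PSD cone so it remains a valid metric.
    """
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    squared = diff @ M @ diff
    outer = np.outer(diff, diff)
    if similar:
        grad = outer
    elif squared < margin:
        grad = -outer
    else:
        grad = np.zeros_like(M)
    return project_to_psd(M - lr * grad)


if __name__ == "__main__":
    a, b = np.array([0.0, 0.0]), np.array([3.0, 4.0])
    M = np.eye(2)
    print(mahalanobis_distance(a, b, M))          # identity metric: Euclidean
    M = sgd_step(M, a, b, similar=True)
    print(mahalanobis_distance(a, b, M))          # smaller after the update
```

The projection step is what keeps the learned matrix inside the feasible set of the semi-definite program; a full SDP solver is avoided by handling one pair at a time.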