xTML : a unified heterogeneous transfer metric learning framework for multimedia applications [application notes]

Bibliographic Details
Published in: IEEE Computational Intelligence Magazine
Main Authors: Liu, L., Luo, Yong, Hu, H., Wen, Yonggang, Tao, D., Yao, X.
Other Authors: School of Computer Science and Engineering
Format: Article in Journal/Newspaper
Language: English
Published: 2020
Subjects:
XML
DML
Online Access: https://hdl.handle.net/10356/154429
https://doi.org/10.1109/MCI.2020.2976187
Description
Summary: Owing to the continual growth of multimodal data (or feature spaces), we have seen a rising interest in multimedia applications (e.g., object classification and searching) over these heterogeneous data. However, the accuracy of classification and searching tasks is highly dependent on the distance estimation between data samples, and simple Euclidean (EU) distance has been proven to be inadequate. Previous research has focused on learning a robust distance metric to quantify the relationships among data samples. In this context, existing distance metric learning (DML) algorithms mainly rely on label information in the target domain for model training and may fail when the label information is scarce. As an improvement, transfer metric learning (TML) approaches have been proposed to leverage information from other related domains. However, current TML algorithms assume that different domains share the same representation; thus, they are not applicable in heterogeneous settings where the data representations of different domains vary. In this research, we propose xTML, a novel unified heterogeneous transfer metric learning framework, to improve the distance estimation of the domains of interest (i.e., the target domains in classification and searching tasks) when only limited label information, complemented by extensive unlabeled data, is available for model training. We further illustrate how our proposed framework can be applied to a selected list of multimedia applications, including opinion mining, deception detection and online product searching.

National Research Foundation (NRF)

This research is supported in part by Singapore NRF2015ENC-GDCR01001-003, administered via IMDA, NRF2015ENCGBICRD001-012, administered via BCA, Youth Program of the National Social Science Fund of China under No. 16CXW008, and National Natural Science Foundation of China (NSFC) under No. 61971457.
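
Illustrative note: the summary above contrasts plain Euclidean distance with a learned distance metric. The sketch below is not taken from the article and does not implement xTML. It assumes the common DML formulation in which the metric is a positive semidefinite matrix M defining a Mahalanobis-style distance, and the matrix M, the helper names and the toy samples are all hypothetical, chosen only to show how a learned metric can change which samples count as neighbours.

```python
import numpy as np


def euclidean(x, y):
    """Plain Euclidean distance between two feature vectors."""
    d = x - y
    return float(np.sqrt(d @ d))


def learned_distance(x, y, M):
    """Mahalanobis-style distance sqrt((x - y)^T M (x - y)) for a PSD matrix M."""
    d = x - y
    return float(np.sqrt(d @ M @ d))


# Hypothetical toy samples: feature 0 is informative, feature 1 is mostly noise.
a = np.array([0.0, 0.0])  # query sample
b = np.array([1.0, 0.0])  # differs from a in the informative feature
c = np.array([0.0, 3.0])  # differs from a only in the noisy feature

# Hypothetical "learned" metric (assumption): upweights feature 0, downweights feature 1.
M = np.diag([4.0, 0.01])

print("Euclidean:", euclidean(a, b), euclidean(a, c))
print("Learned:  ", learned_distance(a, b, M), learned_distance(a, c, M))
```

Under Euclidean distance, b is the nearer neighbour of a. Under the assumed metric M, c becomes nearer, because the difference in the noisy feature is almost ignored. In practice M would be estimated from labeled (and, in the transfer setting, related-domain) data rather than set by hand.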