Distance Metric Learning Using Dropout: A Structured Regularization Approach

Bibliographic Details
Main Authors: Qi Qian, Juhua Hu, Rong Jin, Jian Pei, Shenghuo Zhu
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Subjects:
DML
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.565.1706
http://www.cs.sfu.ca/~jpei/publications/Distance Metric Dropout.pdf
Description
Summary: Distance metric learning (DML) aims to learn a distance metric better than Euclidean distance. It has been successfully applied to various tasks, e.g., classification, clustering, and information retrieval. Many DML algorithms suffer from over-fitting because of the large number of parameters to be determined. In this paper, we exploit the dropout technique, which has been successfully applied in deep learning to alleviate over-fitting, for DML. Unlike previous studies that apply dropout only to the training data, we apply dropout to both the learned metrics and the training data. We illustrate that applying dropout to DML is essentially equivalent to matrix-norm-based regularization. Compared with the standard regularization scheme in DML, dropout has the advantage of simulating structured regularizers, which have shown consistently better performance than non-structured regularizers. We verify, both empirically and theoretically, that dropout is effective in regulating the learned metric to avoid over-fitting. Finally, we examine the idea of wrapping the dropout technique in state-of-the-art DML methods and observe that it can significantly improve the performance of the original DML methods.
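
To make "applying dropout to the learned metric" concrete, here is a minimal Python sketch. It is not the authors' algorithm. It assumes a Mahalanobis metric M, a triplet hinge loss with margin 1, an element-wise symmetric Bernoulli mask, and a projection back onto the positive semidefinite cone after each step. The function names (`dropout_scale`, `project_psd`, `triplet_step`) and all parameter values are hypothetical.

```python
import numpy as np

def dropout_scale(d, keep_prob, rng):
    """Symmetric Bernoulli mask divided by keep_prob, so E[M * scale] = M."""
    upper = np.triu(rng.random((d, d)) < keep_prob)
    mask = upper | upper.T  # mirror the upper triangle to keep symmetry
    return mask / keep_prob

def project_psd(M):
    """Project a matrix onto the PSD cone by clipping negative eigenvalues."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return (V * np.clip(w, 0, None)) @ V.T

def triplet_step(M, x, x_pos, x_neg, lr, keep_prob, rng):
    """One SGD step on a triplet hinge loss, evaluated under a dropped-out metric."""
    S = dropout_scale(M.shape[0], keep_prob, rng)
    M_drop = M * S  # assumed dropout scheme: element-wise on the metric
    dp, dn = x - x_pos, x - x_neg
    # Hinge loss: want d(x, x_pos) + 1 <= d(x, x_neg) under M_drop.
    if 1.0 + dp @ M_drop @ dp - dn @ M_drop @ dn > 0:
        grad = S * (np.outer(dp, dp) - np.outer(dn, dn))  # chain rule through the mask
        M = M - lr * grad
    return project_psd(M)

rng = np.random.default_rng(0)
x, x_pos, x_neg = rng.normal(size=(3, 5))
M = triplet_step(np.eye(5), x, x_pos, x_neg, lr=0.1, keep_prob=0.8, rng=rng)
```

Setting `keep_prob=1.0` turns the mask into all ones and recovers a plain projected SGD step, which gives a simple baseline for comparison.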