Fine-Grained Visual Categorization via Multi-stage Metric Learning

Fine-grained visual categorization (FGVC) is to categorize objects into subordinate classes instead of basic classes. One major challenge in FGVC is the co-occurrence of two issues: 1) many subordinate classes are highly correlated and are difficult to distinguish, and 2) there is large intr...


Bibliographic Details
Main Authors: Qian, Qi, Jin, Rong, Zhu, Shenghuo, Lin, Yuanqing
Format: Text
Language: unknown
Published: 2014
Subjects:
DML
Online Access: http://arxiv.org/abs/1402.0453
id ftarxivpreprints:oai:arXiv.org:1402.0453
record_format openpolar
spelling ftarxivpreprints:oai:arXiv.org:1402.0453 2023-09-05T13:19:05+02:00 Fine-Grained Visual Categorization via Multi-stage Metric Learning Qian, Qi Jin, Rong Zhu, Shenghuo Lin, Yuanqing 2014-02-03 http://arxiv.org/abs/1402.0453 unknown http://arxiv.org/abs/1402.0453 Computer Science - Computer Vision and Pattern Recognition Computer Science - Machine Learning Statistics - Machine Learning text 2014 ftarxivpreprints 2023-08-16T13:14:44Z Fine-grained visual categorization (FGVC) is to categorize objects into subordinate classes instead of basic classes. One major challenge in FGVC is the co-occurrence of two issues: 1) many subordinate classes are highly correlated and are difficult to distinguish, and 2) there is large intra-class variation (e.g., due to object pose). This paper proposes to explicitly address the above two issues via distance metric learning (DML). DML addresses the first issue by learning an embedding so that data points from the same class are pulled together while those from different classes are pushed apart from each other; and it addresses the second issue by allowing the flexibility that only a portion of the neighbors (not all data points) from the same class need to be pulled together. However, the feature representation of an image is often high dimensional, and DML is known to have difficulty in dealing with high dimensional feature vectors since it would require $\mathcal{O}(d^2)$ for storage and $\mathcal{O}(d^3)$ for optimization. To this end, we propose a multi-stage metric learning framework that divides the large-scale high dimensional learning problem into a series of simple subproblems, achieving $\mathcal{O}(d)$ computational complexity. The empirical study with FGVC benchmark datasets verifies that our method is both effective and efficient compared to the state-of-the-art FGVC approaches. Comment: in CVPR 2015 Text DML ArXiv.org (Cornell University Library)
institution Open Polar
collection ArXiv.org (Cornell University Library)
op_collection_id ftarxivpreprints
language unknown
topic Computer Science - Computer Vision and Pattern Recognition
Computer Science - Machine Learning
Statistics - Machine Learning
spellingShingle Computer Science - Computer Vision and Pattern Recognition
Computer Science - Machine Learning
Statistics - Machine Learning
Qian, Qi
Jin, Rong
Zhu, Shenghuo
Lin, Yuanqing
Fine-Grained Visual Categorization via Multi-stage Metric Learning
topic_facet Computer Science - Computer Vision and Pattern Recognition
Computer Science - Machine Learning
Statistics - Machine Learning
description Fine-grained visual categorization (FGVC) is to categorize objects into subordinate classes instead of basic classes. One major challenge in FGVC is the co-occurrence of two issues: 1) many subordinate classes are highly correlated and are difficult to distinguish, and 2) there is large intra-class variation (e.g., due to object pose). This paper proposes to explicitly address the above two issues via distance metric learning (DML). DML addresses the first issue by learning an embedding so that data points from the same class are pulled together while those from different classes are pushed apart from each other; and it addresses the second issue by allowing the flexibility that only a portion of the neighbors (not all data points) from the same class need to be pulled together. However, the feature representation of an image is often high dimensional, and DML is known to have difficulty in dealing with high dimensional feature vectors since it would require $\mathcal{O}(d^2)$ for storage and $\mathcal{O}(d^3)$ for optimization. To this end, we propose a multi-stage metric learning framework that divides the large-scale high dimensional learning problem into a series of simple subproblems, achieving $\mathcal{O}(d)$ computational complexity. The empirical study with FGVC benchmark datasets verifies that our method is both effective and efficient compared to the state-of-the-art FGVC approaches. Comment: in CVPR 2015
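The pull/push idea and the cost figures in the description can be illustrated with a minimal sketch of a generic triplet-based metric learning step. This is not the multi-stage method of the paper; it is a plain full-matrix baseline, shown only to make clear where the $\mathcal{O}(d^2)$ storage (the d-by-d matrix M) and the $\mathcal{O}(d^3)$ cost (the eigendecomposition used to keep M positive semidefinite) come from. The function names, the hinge loss with margin 1, and the learning rate are assumptions chosen for illustration.

```python
import numpy as np


def mahalanobis_sq(M, a, b):
    """Squared distance (a - b)^T M (a - b) under the learned metric M."""
    diff = a - b
    return float(diff @ M @ diff)


def project_psd(M):
    """Project a matrix onto the PSD cone; the eigendecomposition is O(d^3)."""
    M = (M + M.T) / 2.0                      # enforce symmetry first
    w, V = np.linalg.eigh(M)                 # eigenvalues w, eigenvectors in columns of V
    return (V * np.clip(w, 0.0, None)) @ V.T  # drop negative eigenvalues


def triplet_step(M, x, x_pos, x_neg, lr=0.1, margin=1.0):
    """One gradient step on the hinge loss of a single triplet.

    x_pos is a same-class neighbor of x (to be pulled closer),
    x_neg is a point from another class (to be pushed away).
    Returns the updated metric and the loss before the update.
    """
    loss = margin + mahalanobis_sq(M, x, x_pos) - mahalanobis_sq(M, x, x_neg)
    if loss <= 0.0:
        return M, 0.0                        # triplet already satisfied
    dp, dn = x - x_pos, x - x_neg
    grad = np.outer(dp, dp) - np.outer(dn, dn)  # d x d, hence O(d^2) storage
    return project_psd(M - lr * grad), loss


# Tiny hand-checkable example in d = 2 (illustrative data, not from the paper).
M0 = np.eye(2)
M1, loss1 = triplet_step(
    M0,
    x=np.array([0.0, 0.0]),
    x_pos=np.array([1.0, 0.0]),
    x_neg=np.array([0.0, 1.0]),
)
```

In this example the update shrinks the metric along the direction separating x from its same-class neighbor and stretches it along the direction to the other-class point. Restricting x_pos to a few nearest same-class neighbors, rather than all same-class points, corresponds to the flexibility mentioned in the description for handling intra-class variation.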
format Text
author Qian, Qi
Jin, Rong
Zhu, Shenghuo
Lin, Yuanqing
author_facet Qian, Qi
Jin, Rong
Zhu, Shenghuo
Lin, Yuanqing
author_sort Qian, Qi
title Fine-Grained Visual Categorization via Multi-stage Metric Learning
title_short Fine-Grained Visual Categorization via Multi-stage Metric Learning
title_full Fine-Grained Visual Categorization via Multi-stage Metric Learning
title_fullStr Fine-Grained Visual Categorization via Multi-stage Metric Learning
title_full_unstemmed Fine-Grained Visual Categorization via Multi-stage Metric Learning
title_sort fine-grained visual categorization via multi-stage metric learning
publishDate 2014
url http://arxiv.org/abs/1402.0453
genre DML
genre_facet DML
op_relation http://arxiv.org/abs/1402.0453
_version_ 1776199903752486912