Skeleton-DML: Deep Metric Learning for Skeleton-Based One-Shot Action Recognition
One-shot action recognition allows the recognition of human-performed actions with only a single training example. This can influence human-robot-interaction positively by enabling the robot to react to previously unseen behaviour. We formulate the one-shot action recognition problem as a deep metri...
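Main Authors: | Memmesheimer, Raphael; Häring, Simon; Theisen, Nick; Paulus, Dietrich |
---|---|
Format: | Text |
Language: | unknown |
Published: | 2020 |
Subjects: | Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence; Computer Science - Robotics |
Online Access: | http://arxiv.org/abs/2012.13823 |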
id | ftarxivpreprints:oai:arXiv.org:2012.13823 |
---|---|
record_format | openpolar |
spelling | ftarxivpreprints:oai:arXiv.org:2012.13823 2023-09-05T13:19:06+02:00 Skeleton-DML: Deep Metric Learning for Skeleton-Based One-Shot Action Recognition Memmesheimer, Raphael Häring, Simon Theisen, Nick Paulus, Dietrich 2020-12-26 http://arxiv.org/abs/2012.13823 unknown http://arxiv.org/abs/2012.13823 Computer Science - Computer Vision and Pattern Recognition Computer Science - Artificial Intelligence Computer Science - Robotics text 2020 ftarxivpreprints 2023-08-16T16:16:10Z One-shot action recognition allows the recognition of human-performed actions with only a single training example. This can influence human-robot-interaction positively by enabling the robot to react to previously unseen behaviour. We formulate the one-shot action recognition problem as a deep metric learning problem and propose a novel image-based skeleton representation that performs well in a metric learning setting. Therefore, we train a model that projects the image representations into an embedding space. In embedding space the similar actions have a low euclidean distance while dissimilar actions have a higher distance. The one-shot action recognition problem becomes a nearest-neighbor search in a set of activity reference samples. We evaluate the performance of our proposed representation against a variety of other skeleton-based image representations. In addition, we present an ablation study that shows the influence of different embedding vector sizes, losses and augmentation. Our approach lifts the state-of-the-art by 3.3% for the one-shot action recognition protocol on the NTU RGB+D 120 dataset under a comparable training setup. With additional augmentation our result improved over 7.7%. Comment: 8 pages, 8 figures, 4 tables Text DML ArXiv.org (Cornell University Library) |
institution | Open Polar |
collection | ArXiv.org (Cornell University Library) |
op_collection_id | ftarxivpreprints |
language | unknown |
topic | Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence; Computer Science - Robotics |
description | One-shot action recognition allows the recognition of human-performed actions with only a single training example. This can influence human-robot-interaction positively by enabling the robot to react to previously unseen behaviour. We formulate the one-shot action recognition problem as a deep metric learning problem and propose a novel image-based skeleton representation that performs well in a metric learning setting. Therefore, we train a model that projects the image representations into an embedding space. In embedding space the similar actions have a low euclidean distance while dissimilar actions have a higher distance. The one-shot action recognition problem becomes a nearest-neighbor search in a set of activity reference samples. We evaluate the performance of our proposed representation against a variety of other skeleton-based image representations. In addition, we present an ablation study that shows the influence of different embedding vector sizes, losses and augmentation. Our approach lifts the state-of-the-art by 3.3% for the one-shot action recognition protocol on the NTU RGB+D 120 dataset under a comparable training setup. With additional augmentation our result improved over 7.7%. Comment: 8 pages, 8 figures, 4 tables |
format | Text |
author | Memmesheimer, Raphael; Häring, Simon; Theisen, Nick; Paulus, Dietrich |
author_sort | Memmesheimer, Raphael |
title | Skeleton-DML: Deep Metric Learning for Skeleton-Based One-Shot Action Recognition |
publishDate | 2020 |
url | http://arxiv.org/abs/2012.13823 |
genre | DML |
op_relation | http://arxiv.org/abs/2012.13823 |
_version_ | 1776199924526874624 |
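
The description in this record frames one-shot recognition as a nearest-neighbor search over reference samples in an embedding space, using Euclidean distance. The following is a minimal sketch of that lookup step only, not the authors' implementation: the function name `nearest_reference`, the 2-D embedding values, and the action labels are all hypothetical, and the trained model that would produce real embeddings is assumed to exist elsewhere.

```python
import numpy as np


def nearest_reference(query, references, labels):
    """Return the label and distance of the reference embedding closest to `query`.

    Hypothetical helper: `references` holds one embedding per known action
    (the single "shot"), and `labels` holds the matching action names.
    """
    query = np.asarray(query, dtype=float)
    references = np.asarray(references, dtype=float)
    # Euclidean distance from the query to every reference embedding.
    distances = np.linalg.norm(references - query, axis=1)
    best = int(np.argmin(distances))
    return labels[best], float(distances[best])


# Made-up 2-D embeddings standing in for the output of a trained encoder.
references = [[0.0, 0.0], [3.0, 4.0], [10.0, 0.0]]
labels = ["wave", "clap", "jump"]

label, distance = nearest_reference([2.0, 4.0], references, labels)
print(label, distance)
```

In this toy setup the query lies closest to the second reference, so it is assigned that reference's label. Real embeddings would have many more dimensions, and the vector size is one of the factors the description says the ablation study varies.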