Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects ...

Bibliographic Details
Main Authors: Fan, Zicong; Ohkawa, Takehiko; Yang, Linlin; Lin, Nie; Zhou, Zhishan; Zhou, Shihao; Liang, Jiajun; Gao, Zhong; Zhang, Xuanyang; Zhang, Xue; Li, Fei; Zheng, Liu; Lu, Feng; Zeid, Karim Abou; Leibe, Bastian; On, Jeongwan; Baek, Seungryul; Prakash, Aditya; Gupta, Saurabh; He, Kun; Sato, Yoichi; Hilliges, Otmar; Chang, Hyung Jin; Yao, Angela
Format: Article in Journal/Newspaper
Language: unknown
Published: arXiv 2024
Online Access: https://dx.doi.org/10.48550/arxiv.2403.16428
https://arxiv.org/abs/2403.16428
Description
Summary: We interact with the world with our hands and see it through our own (egocentric) perspective. A holistic 3D understanding of such interactions from egocentric views is important for tasks in robotics, AR/VR, action recognition, and motion generation. Accurately reconstructing such interactions in 3D is challenging due to heavy occlusion, viewpoint bias, camera distortion, and motion blur from head movement. To this end, we designed the HANDS23 challenge based on the AssemblyHands and ARCTIC datasets, with carefully designed training and testing splits. Based on the results of the top submitted methods and more recent baselines on the leaderboards, we perform a thorough analysis of 3D hand(-object) reconstruction tasks. Our analysis demonstrates the effectiveness of addressing distortion specific to egocentric cameras, adopting high-capacity transformers to learn complex hand-object interactions, and fusing predictions from different views. Our study further reveals challenging scenarios intractable with ...