Get a Grip: Reconstructing Hand-Object Stable Grasps in Egocentric Videos

Bibliographic Details
Main Authors: Zhu, Zhifan; Damen, Dima
Format: Text
Language: unknown
Published: 2023
Subjects:
Online Access: http://arxiv.org/abs/2312.15719
Description
Summary: We address in-the-wild hand-object reconstruction for a known object category in egocentric videos, focusing on temporal periods of stable grasps. We propose the task of Hand-Object Stable Grasp Reconstruction (HO-SGR), the joint reconstruction of frames during which the hand is stably holding the object. This lets us constrain the object motion relative to the hand, effectively regularising the reconstruction and improving performance. By analysing the 3D ARCTIC dataset, we identify temporal periods where the contact area between the hand and object vertices remains stable. We show that objects within stable grasps move within a single degree of freedom (1 DoF). We thus propose a method for jointly optimising all frames within a stable grasp by constraining the object's rotation to a latent 1 DoF. We then extend this knowledge to in-the-wild egocentric videos by labelling 2.4K clips of stable grasps from the EPIC-KITCHENS dataset. Our proposed EPIC-Grasps dataset includes 390 object instances of 9 categories, featuring stable grasps from videos of daily interactions in 141 environments. Our method achieves significantly better HO-SGR, both qualitatively and as measured by the stable grasp area and the mask overlap against 2D projection labels.
Comment: webpage: https://zhifanzhu.github.io/getagrip
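To make the 1 DoF observation in the summary concrete, the following is a minimal sketch, not the authors' method or code. It assumes the object's rotation relative to the hand is available per frame as a unit quaternion `(w, x, y, z)`, and that the motion is a rotation about an axis fixed in the hand frame. It then estimates a single shared axis and reports how much rotation falls outside it. All function names here (`dominant_axis`, `off_axis_residual`, and the quaternion helpers) are hypothetical and chosen only for illustration.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def quat_conj(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def quat_from_axis_angle(axis, angle):
    """Unit quaternion for a rotation of `angle` radians about `axis`."""
    n = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle / 2.0) / n
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def rotvec(q):
    """Rotation vector (unit axis times angle in [0, pi]) of a unit quaternion."""
    w, x, y, z = q
    if w < 0:  # q and -q encode the same rotation; pick the short way round
        w, x, y, z = -w, -x, -y, -z
    n = math.sqrt(x * x + y * y + z * z)
    if n < 1e-12:
        return (0.0, 0.0, 0.0)
    angle = 2.0 * math.atan2(n, w)
    return (x / n * angle, y / n * angle, z / n * angle)

def dominant_axis(vectors, iters=50):
    """Power iteration on the 3x3 scatter matrix sum(v v^T) of the vectors."""
    scatter = [[sum(v[i] * v[j] for v in vectors) for j in range(3)]
               for i in range(3)]
    axis = max(vectors, key=lambda v: sum(c * c for c in v))
    norm = math.sqrt(sum(c * c for c in axis))
    if norm < 1e-12:
        return (0.0, 0.0, 1.0)  # no relative rotation at all; axis is arbitrary
    axis = tuple(c / norm for c in axis)
    for _ in range(iters):
        nxt = tuple(sum(scatter[i][j] * axis[j] for j in range(3))
                    for i in range(3))
        norm = math.sqrt(sum(c * c for c in nxt))
        axis = tuple(c / norm for c in nxt)
    return axis

def off_axis_residual(quats):
    """Return (axis, mean off-axis rotation in radians) for >= 2 frames.

    Assumption: each quaternion is the object's rotation in the hand frame,
    so rotations relative to the first frame share one axis under 1 DoF motion.
    """
    q0_inv = quat_conj(quats[0])
    rel = [rotvec(quat_mul(q, q0_inv)) for q in quats[1:]]
    axis = dominant_axis(rel)
    residuals = []
    for r in rel:
        along = sum(r[i] * axis[i] for i in range(3))
        perp = [r[i] - along * axis[i] for i in range(3)]
        residuals.append(math.sqrt(sum(c * c for c in perp)))
    return axis, sum(residuals) / len(residuals)

# Hypothetical frames: the object only turns about the z axis of the hand frame.
frames = [quat_from_axis_angle((0, 0, 1), t) for t in (0.0, 0.2, 0.4, 0.6)]
axis, residual = off_axis_residual(frames)
```

A residual near zero indicates that the frames are consistent with a single rotation axis; a reconstruction method could penalise this quantity during joint optimisation, although whether and how the paper does so is not stated in this record.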