Spatial Gating with Hybrid Receptive Field for Robot Visual Localization

Bibliographic Details
Published in: International Journal of Computational Intelligence Systems
Main Authors: Shuhong Zhou, Junjun Wu, Qinghua Lu
Format: Article in Journal/Newspaper
Language: English
Published: Springer 2024
Online Access: https://doi.org/10.1007/s44196-024-00501-z
https://doaj.org/article/1dad291f5f6f460da07c999fa0bc62bc
Description
Summary: Visual localization for mobile robots is a challenging task that requires extracting relevant scene information from images captured by the robot’s visual sensors in order to determine its position within an environment. The task is complicated by variations in environmental conditions, which reduce localization accuracy. To address the challenges that changes in illumination, season, and viewpoint pose for visual localization, this paper proposes a visual localization network based on gated selection and a hybrid receptive field. We use a fine-tuned DINOv2 for local feature extraction and leverage a hybrid receptive field to enhance the diversity of visual features. Furthermore, our approach employs spatial gating to dynamically select and aggregate the most useful spatial features. Extensive experiments demonstrate that the visual localization performance of our approach surpasses existing methods on multiple challenging datasets. In particular, it achieves a Recall@1 of 69.2% on the NordLand dataset, a 10.8% improvement over MixVPR.
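
The summary reports results as Recall@1. As a rough illustration of how a Recall@K figure is commonly computed in image-retrieval-based localization, the sketch below counts a query as correct if any of its K most similar reference descriptors is a ground-truth match. It is not taken from the paper: the function name `recall_at_k`, the use of cosine similarity, and the toy descriptors and ground-truth lists are all assumptions made for illustration.

```python
import numpy as np

def recall_at_k(query_desc, ref_desc, positives, k=1):
    """Fraction of queries with at least one true match among the top-k
    retrieved references (hypothetical helper, not the paper's code)."""
    # L2-normalise so the dot product equals cosine similarity (an assumption).
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    r = ref_desc / np.linalg.norm(ref_desc, axis=1, keepdims=True)
    sims = q @ r.T                               # shape: (num_queries, num_refs)
    topk = np.argsort(-sims, axis=1)[:, :k]      # indices of the k most similar refs
    hits = [
        bool(set(topk[i].tolist()) & set(positives[i]))
        for i in range(len(positives))
    ]
    return float(np.mean(hits))

# Toy data: four 2-D reference descriptors and three queries.
refs = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
queries = np.array([[0.9, 0.1], [0.1, 0.9], [-0.9, -0.2]])
# Ground truth: reference indices that count as a correct match per query.
positives = [[0], [1], [3]]

r1 = recall_at_k(queries, refs, positives, k=1)
r2 = recall_at_k(queries, refs, positives, k=2)
print(f"Recall@1 = {r1:.3f}, Recall@2 = {r2:.3f}")
```

In this toy case the third query's nearest reference is not its ground-truth match, so it only counts as correct once K is raised to 2. Real benchmarks usually derive the ground-truth lists from a distance or frame-offset threshold, which this sketch leaves out.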