MVP: Unified Motion and Visual Self-Supervised Learning for Large-Scale Robotic Navigation
Main Authors: | Chancán, Marvin; Milford, Michael |
---|---|
Format: | Article in Journal/Newspaper |
Language: | unknown |
Published: | arXiv, 2020 |
Subjects: | Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG) |
Online Access: | https://dx.doi.org/10.48550/arxiv.2003.00667 https://arxiv.org/abs/2003.00667 |
id |
ftdatacite:10.48550/arxiv.2003.00667 |
institution |
Open Polar |
collection |
DataCite Metadata Store (German National Library of Science and Technology) |
language |
unknown |
topic |
Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); FOS: Computer and information sciences |
description |
Autonomous navigation emerges from both motion and local visual perception in real-world environments. However, most successful robotic motion estimation methods (e.g. VO, SLAM, SfM) and vision systems (e.g. CNNs, visual place recognition, VPR) are typically used separately for mapping and localization tasks. Conversely, recent reinforcement learning (RL) based methods for visual navigation rely on GPS data as ground truth, which may be unreliable across multiple, month-spaced traversals of large environments. In this paper, we propose a novel motion and visual perception approach, dubbed MVP, that unifies these two sensor modalities for large-scale, target-driven navigation tasks. Our MVP-based method learns faster, and is more accurate and more robust to both extreme environmental changes and poor GPS data, than corresponding vision-only navigation methods. MVP temporally combines compact image representations, obtained via VPR, with optimized motion estimates, including but not limited to those from VO or optimized radar odometry (RO), to efficiently learn self-supervised navigation policies via RL. We evaluate our method on two large real-world datasets, Oxford RobotCar and Nordland Railway, over a range of weather (e.g. overcast, night, snow, sun, rain, clouds) and seasonal (e.g. winter, spring, fall, summer) conditions using the new CityLearn framework, an interactive environment for efficiently training navigation agents. Our experimental results on traversals of the Oxford RobotCar dataset with no GPS data show that MVP achieves 53% and 93% navigation success rates using VO and RO, respectively, compared to 7% for a vision-only method. We additionally report a trade-off between RL success rate and motion estimation precision. Comment: Under review at IROS 2020 |
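The description above outlines the core idea: fuse a compact VPR image descriptor with a motion estimate (from VO or RO) into a single observation for an RL navigation policy. A minimal sketch of that fusion follows; this is not the authors' code, and every name in it is hypothetical. A random projection stands in for a learned VPR encoder, and an untrained linear head stands in for a policy network that would normally be recurrent and trained with RL.

```python
import numpy as np

rng = np.random.default_rng(0)

def vpr_embedding(image):
    """Placeholder for a compact VPR descriptor: a fixed random
    projection of the flattened image, standing in for a learned
    CNN-based place-recognition encoder."""
    proj = rng.standard_normal((image.size, 64)) / np.sqrt(image.size)
    return image.ravel() @ proj          # (64,) descriptor

def motion_vector(odom):
    """Odometry increment (dx, dy, dtheta) from VO or radar odometry."""
    return np.asarray(odom, dtype=float)  # (3,)

def policy(obs, W, b):
    """Toy linear policy head over the fused observation; a real agent
    would use a recurrent network whose weights are learned via RL."""
    logits = obs @ W + b
    return int(np.argmax(logits))        # discrete action index

# Fused observation: visual descriptor concatenated with motion estimate.
image = rng.random((32, 32))
obs = np.concatenate([vpr_embedding(image),
                      motion_vector([0.5, 0.0, 0.01])])

n_actions = 3                            # e.g. turn left / go straight / turn right
W = rng.standard_normal((obs.size, n_actions)) * 0.01
b = np.zeros(n_actions)
action = policy(obs, W, b)
```

The abstract's reported trade-off (VO vs. RO success rates) enters through the quality of the `motion_vector` input: the policy architecture is unchanged, only the precision of the motion estimate varies.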
format |
Article in Journal/Newspaper |
author |
Chancán, Marvin; Milford, Michael |
title |
MVP: Unified Motion and Visual Self-Supervised Learning for Large-Scale Robotic Navigation |
publisher |
arXiv |
publishDate |
2020 |
url |
https://dx.doi.org/10.48550/arxiv.2003.00667 https://arxiv.org/abs/2003.00667 |
genre |
Nordland |
op_rights |
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode |
op_doi |
https://doi.org/10.48550/arxiv.2003.00667 |