DeepSeqSLAM: A Trainable CNN+RNN for Joint Global Description and Sequence-based Place Recognition

Sequence-based place recognition methods for all-weather navigation are well-known for producing state-of-the-art results under challenging day-night or summer-winter transitions. These systems, however, rely on complex handcrafted heuristics for sequential matching - which are applied on top of a p...


Bibliographic Details
Main Authors: Chancán, Marvin, Milford, Michael
Format: Article in Journal/Newspaper
Language: unknown
Published: arXiv 2020
Subjects:
Online Access: https://dx.doi.org/10.48550/arxiv.2011.08518
https://arxiv.org/abs/2011.08518
id ftdatacite:10.48550/arxiv.2011.08518
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic Computer Vision and Pattern Recognition cs.CV
Artificial Intelligence cs.AI
Machine Learning cs.LG
Robotics cs.RO
FOS Computer and information sciences
description Sequence-based place recognition methods for all-weather navigation are well known for producing state-of-the-art results under challenging day-night or summer-winter transitions. These systems, however, rely on complex handcrafted heuristics for sequential matching, applied on top of a pre-computed pairwise similarity matrix between reference and query image sequences of a single route, to further reduce false-positive rates compared to single-frame retrieval methods. As a result, multi-frame place recognition can be extremely slow for deployment on autonomous vehicles or evaluation on large datasets, and can fail when using relatively short parameter values such as a sequence length of 2 frames. In this paper, we propose DeepSeqSLAM: a trainable CNN+RNN architecture for jointly learning visual and positional representations from a single monocular image sequence of a route. We demonstrate our approach on two large benchmark datasets, Nordland and Oxford RobotCar, recorded over 728 km and 10 km routes, respectively, each over 1 year with multiple seasons, weather, and lighting conditions. On Nordland, we compare our method to two state-of-the-art sequence-based methods across the entire route under summer-winter changes using a sequence length of 2, and show that our approach achieves over 72% AUC compared to 27% AUC for Delta Descriptors and 2% AUC for SeqSLAM, while reducing the deployment time from around 1 hour to 1 minute against both. The framework code and video are available at https://mchancan.github.io/deepseqslam (9 pages, 6 figures, 2 tables).
format Article in Journal/Newspaper
author Chancán, Marvin
Milford, Michael
title DeepSeqSLAM: A Trainable CNN+RNN for Joint Global Description and Sequence-based Place Recognition
publisher arXiv
publishDate 2020
url https://dx.doi.org/10.48550/arxiv.2011.08518
https://arxiv.org/abs/2011.08518
genre Nordland
op_rights Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
cc-by-4.0
op_rightsnorm CC-BY
op_doi https://doi.org/10.48550/arxiv.2011.08518
_version_ 1766115702048030720