Scene Retrieval for Contextual Visual Mapping

Visual navigation localizes a query place image against a reference database of place images, also known as a 'visual map'. Localization accuracy requirements for specific areas of the visual map, 'scene classes', vary according to the context of the environment and task. State-of-the-art...

Bibliographic Details
Main Authors:	Smith, William H. B.; Milford, Michael; McDonald-Maier, Klaus D.; Ehsan, Shoaib
Format:	Article in Journal/Newspaper
Language:	unknown
Published:	arXiv 2021
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); FOS: Computer and information sciences; Nordland
Online Access:	https://dx.doi.org/10.48550/arxiv.2102.12728 https://arxiv.org/abs/2102.12728

id	ftdatacite:10.48550/arxiv.2102.12728
record_format	openpolar
spelling	ftdatacite:10.48550/arxiv.2102.12728 2023-05-15T17:24:41+02:00 Scene Retrieval for Contextual Visual Mapping Smith, William H. B. Milford, Michael McDonald-Maier, Klaus D. Ehsan, Shoaib 2021 https://dx.doi.org/10.48550/arxiv.2102.12728 https://arxiv.org/abs/2102.12728 unknown arXiv Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode cc-by-4.0 CC-BY Computer Vision and Pattern Recognition cs.CV Robotics cs.RO FOS Computer and information sciences Article CreativeWork article Preprint 2021 ftdatacite https://doi.org/10.48550/arxiv.2102.12728 2022-03-10T14:54:21Z Visual navigation localizes a query place image against a reference database of place images, also known as a 'visual map'. Localization accuracy requirements for specific areas of the visual map, 'scene classes', vary according to the context of the environment and task. State-of-the-art visual mapping is unable to reflect these requirements by explicitly targeting scene classes for inclusion in the map. Four different scene classes, including pedestrian crossings and stations, are identified in each of the Nordland and St. Lucia datasets. Instead of re-training separate scene classifiers, which struggle with these overlapping scene classes, we make our first contribution: defining the problem of 'scene retrieval'. Scene retrieval extends image retrieval to classification of scenes defined at test time by associating a single query image with reference images of scene classes. Our second contribution is a triplet-trained convolutional neural network (CNN) to address this problem, which increases scene classification accuracy by up to 7% against state-of-the-art networks pre-trained for scene recognition. The third contribution is an algorithm, 'DMC', that combines our scene classification with distance and memorability for visual mapping. Our analysis shows that DMC includes 64% more images of our chosen scene classes in a visual map than just using distance interval mapping. State-of-the-art visual place descriptors AMOS-Net, Hybrid-Net and NetVLAD are finally used to show that DMC improves scene class localization accuracy by a mean of 3% and localization accuracy of the remaining map images by a mean of 10% across both datasets. : 8 page paper on visual place recognition and scene classification Article in Journal/Newspaper Nordland Nordland Nordland DataCite Metadata Store (German National Library of Science and Technology)
institution	Open Polar
collection	DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id	ftdatacite
language	unknown
topic	Computer Vision and Pattern Recognition cs.CV Robotics cs.RO FOS Computer and information sciences
spellingShingle	Computer Vision and Pattern Recognition cs.CV Robotics cs.RO FOS Computer and information sciences Smith, William H. B. Milford, Michael McDonald-Maier, Klaus D. Ehsan, Shoaib Scene Retrieval for Contextual Visual Mapping
topic_facet	Computer Vision and Pattern Recognition cs.CV Robotics cs.RO FOS Computer and information sciences
description	Visual navigation localizes a query place image against a reference database of place images, also known as a 'visual map'. Localization accuracy requirements for specific areas of the visual map, 'scene classes', vary according to the context of the environment and task. State-of-the-art visual mapping is unable to reflect these requirements by explicitly targeting scene classes for inclusion in the map. Four different scene classes, including pedestrian crossings and stations, are identified in each of the Nordland and St. Lucia datasets. Instead of re-training separate scene classifiers, which struggle with these overlapping scene classes, we make our first contribution: defining the problem of 'scene retrieval'. Scene retrieval extends image retrieval to classification of scenes defined at test time by associating a single query image with reference images of scene classes. Our second contribution is a triplet-trained convolutional neural network (CNN) to address this problem, which increases scene classification accuracy by up to 7% against state-of-the-art networks pre-trained for scene recognition. The third contribution is an algorithm, 'DMC', that combines our scene classification with distance and memorability for visual mapping. Our analysis shows that DMC includes 64% more images of our chosen scene classes in a visual map than just using distance interval mapping. State-of-the-art visual place descriptors AMOS-Net, Hybrid-Net and NetVLAD are finally used to show that DMC improves scene class localization accuracy by a mean of 3% and localization accuracy of the remaining map images by a mean of 10% across both datasets. : 8 page paper on visual place recognition and scene classification
format	Article in Journal/Newspaper
author	Smith, William H. B. Milford, Michael McDonald-Maier, Klaus D. Ehsan, Shoaib
author_facet	Smith, William H. B. Milford, Michael McDonald-Maier, Klaus D. Ehsan, Shoaib
author_sort	Smith, William H. B.
title	Scene Retrieval for Contextual Visual Mapping
title_short	Scene Retrieval for Contextual Visual Mapping
title_full	Scene Retrieval for Contextual Visual Mapping
title_fullStr	Scene Retrieval for Contextual Visual Mapping
title_full_unstemmed	Scene Retrieval for Contextual Visual Mapping
title_sort	scene retrieval for contextual visual mapping
publisher	arXiv
publishDate	2021
url	https://dx.doi.org/10.48550/arxiv.2102.12728 https://arxiv.org/abs/2102.12728
genre	Nordland Nordland Nordland
genre_facet	Nordland Nordland Nordland
op_rights	Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode cc-by-4.0
op_rightsnorm	CC-BY
op_doi	https://doi.org/10.48550/arxiv.2102.12728
_version_	1766115792196206592
