A visual questioning answering approach to enhance robot localization in indoor environments

dc.contributor.author: Peña-Narvaez, Juan Diego
dc.contributor.author: Martín, Francisco
dc.contributor.author: Guerrero Hernández, José Miguel
dc.contributor.author: Pérez-Rodríguez, Rodrigo
dc.date.accessioned: 2023-12-19T11:31:21Z
dc.date.available: 2023-12-19T11:31:21Z
dc.date.issued: 2023-11-27
dc.description: Use of a visual large language model to localize a robot in an indoor environment.
dc.description.abstract: Navigating robots with precision in complex environments remains a significant challenge. In this article, we present an innovative approach to enhance robot localization in dynamic and intricate spaces like homes and offices. We leverage Visual Question Answering (VQA) techniques to integrate semantic insights into traditional mapping methods, formulating a novel position hypothesis generation to assist localization methods, while also addressing challenges related to mapping accuracy and localization reliability. Our methodology combines a probabilistic approach with the latest advances in Monte Carlo Localization methods and Visual Language models. The integration of our hypothesis generation mechanism results in more robust robot localization compared to existing approaches. Experimental validation demonstrates the effectiveness of our approach, surpassing state-of-the-art multi-hypothesis algorithms in both position estimation and particle quality. This highlights the potential for accurate self-localization, even in symmetric environments with large corridor spaces. Furthermore, our approach exhibits a high recovery rate from deliberate position alterations, showcasing its robustness. By merging visual sensing, semantic mapping, and advanced localization techniques, we open new horizons for robot navigation. Our work bridges the gap between visual perception, semantic understanding, and traditional mapping, enabling robots to interact with their environment through questions and enrich their map with valuable insights. The code for this project is available on GitHub: https://github.com/juandpenan/topology_nav_ros2
dc.identifier.citation: Peña-Narvaez JD, Martín F, Guerrero JM and Pérez-Rodríguez R (2023) A visual questioning answering approach to enhance robot localization in indoor environments. Front. Neurorobot. 17:1290584. doi: 10.3389/fnbot.2023.1290584
dc.identifier.doi: 10.3389/fnbot.2023.1290584
dc.identifier.uri: https://hdl.handle.net/10115/27453
dc.language.iso: eng
dc.publisher: Frontiers in Neurorobotics
dc.rights: Attribution 4.0 International
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: visual question answering
dc.subject: robot localization
dc.subject: robot navigation
dc.subject: semantic map
dc.subject: robot mapping
dc.title: A visual questioning answering approach to enhance robot localization in indoor environments
dc.type: info:eu-repo/semantics/article
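As an illustration only (not code from the paper), the abstract's idea of using a VQA-derived semantic cue to seed Monte Carlo Localization hypotheses could be sketched as follows; the semantic map, room labels, particle count, and noise value are all invented for this example.

```python
import random

# Hypothetical semantic map: room label -> candidate (x, y) anchor points
# where that room appears in the metric map.
SEMANTIC_MAP = {
    "kitchen": [(1.0, 2.0), (1.5, 2.5)],
    "corridor": [(4.0, 0.0), (6.0, 0.0), (8.0, 0.0)],
    "office": [(10.0, 3.0)],
}

def generate_hypotheses(room_label, n_particles=100, noise=0.3):
    """Spread candidate particles around map anchors matching a VQA answer.

    If the VQA model answers e.g. "kitchen" to "What room is this?",
    hypotheses are sampled near known kitchen locations instead of
    uniformly over the map.
    """
    anchors = SEMANTIC_MAP.get(room_label, [])
    if not anchors:
        return []  # no semantic cue: fall back to the plain MCL prior
    particles = []
    for _ in range(n_particles):
        ax, ay = random.choice(anchors)
        particles.append((ax + random.gauss(0.0, noise),
                          ay + random.gauss(0.0, noise)))
    return particles

hypotheses = generate_hypotheses("kitchen", n_particles=50)
```

In a full system these hypotheses would be merged into the MCL particle filter's proposal distribution; here they are simply returned as a list of poses.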

Files

Name: fnbot-17-1290584 (1).pdf
Size: 2.77 MB
Format: Adobe Portable Document Format