Modeling 3D Hand Interactions for Accurate Contact and Manipulation
Date
2024
Publisher
Universidad Rey Juan Carlos
Abstract
Over the past few years, Augmented Reality (AR) and Virtual Reality (VR) have
gained popularity and continue to grow in both interest and use, with a particular
focus on user interaction. These technologies transform how people engage
with digital content and immersive environments, attracting considerable attention
across sectors such as entertainment, education, training, and healthcare. The
need for natural interaction in VR and AR has emerged to enhance immersion,
accessibility, and realism. The goal is to redefine human-computer interaction by
seamlessly blending the virtual and physical worlds, offering varied levels of
engagement, from subtle virtual manipulation to full-body interactions within
simulated environments.
AR and VR rely on two major technology components that are typically addressed
independently: hand tracking and interaction, involving both hand-object and
hand-to-hand scenarios. Existing methods often simplify these challenges, limiting
their real-world impact. For hand-object interaction, the most general approach
uses physics simulation to enable hands and objects to interact according to the
laws of contact mechanics. However, differences in size and skeletal morphology
between the hand representations used by simulators and by tracking devices
complicate this process. The first contribution of this thesis is a personalized
soft-hand model paired with a pose retargeting strategy, formulated as an
optimization problem, that connects the tracked and simulated hand representations.
This method integrates off-the-shelf hand tracking solutions with physics-based
hand simulation without requiring a common hand representation, while still
allowing the hand model to be parametrized.
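
The abstract does not give the exact retargeting objective, but as a minimal, purely illustrative sketch, connecting a tracked hand to a simulated one can be expressed as a least-squares optimization over the simulated skeleton's joint angles. Everything below (the placeholder forward kinematics, the 21-keypoint layout, the temporal smoothness term) is an assumption for illustration, not the thesis' actual formulation.

# Hypothetical sketch: retarget tracked hand keypoints onto a simulated
# skeleton by optimizing its joint angles. Forward kinematics, joint
# layout, and weights are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

def forward_kinematics(theta):
    # Placeholder FK: interpret theta as 21 per-joint 3D offsets accumulated
    # along a single chain. A real model would follow the hand's kinematic tree.
    return np.cumsum(theta.reshape(21, 3), axis=0)

def retarget(tracked_kps, theta0, theta_prev, smooth_w=0.1):
    """Find simulated-hand joint angles whose keypoints match the tracker's."""
    def objective(theta):
        fit = np.sum((forward_kinematics(theta) - tracked_kps) ** 2)
        smooth = smooth_w * np.sum((theta - theta_prev) ** 2)  # temporal term
        return fit + smooth
    res = minimize(objective, theta0, method="L-BFGS-B")
    return res.x

tracked = np.random.rand(21, 3)   # 3D keypoints from any off-the-shelf tracker
theta_prev = np.zeros(63)         # previous frame's solution
theta = retarget(tracked, theta_prev.copy(), theta_prev)

In this framing, any tracker can supply the keypoints while the optimized angles drive the physics-based hand, which is what allows the two sides to use different hand representations.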
Hand-object interaction requires tracking the hand in the real world to map
our gestures into the virtual scenario. This hand tracking problem is a growing
research field with the potential to provide a natural interface for interacting
with virtual environments. Common solutions use computer vision methods, often
coupled with learning-based tracking algorithms, which can be depth-based or
RGB-based. These methods output the skeletal morphology and configuration of a
hand that best matches the user's actual hand, and some also estimate the hand
shape. Given the ubiquity of RGB cameras, research has shifted towards RGB-based
methods. Despite recent advances, the 3D tracking of two interacting hands from
RGB images remains challenging due to issues such as inter-hand occlusion, depth
ambiguity, handedness segmentation, and collisions. Additionally, machine
learning-based approaches are difficult to train because sufficient high-quality
training data is hard to obtain, further complicating the development of robust
hand tracking systems.
To address these challenges, we propose the first system that simulates
physically correct two-hand interactions with personalized hand shapes and
diverse appearances, generating precise synthetic data. This framework is a
major component of a state-of-the-art algorithm for tracking two interacting
hands from RGB
images. Furthermore, we tackle the depth errors that prevent accurate
hand-to-hand contact detection while tracking two interacting hands by
developing an image-based, data-driven approach formulated as an image-to-image
translation problem. To train our method, we introduce a new pipeline for
automatically annotating dense surface contacts in hand interaction sequences.
Consequently, our method estimates camera-space contacts during interactions,
and these estimates can be plugged into any two-hand tracking framework.
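
The abstract frames contact estimation as an image-to-image translation problem but does not specify an architecture. As a minimal sketch under assumed details, a small encoder-decoder could map an RGB frame of two interacting hands to a per-pixel contact probability map, trained against dense contact labels such as those produced by the automatic annotation pipeline described above. The network layout, tensor shapes, and loss below are illustrative assumptions, not the thesis' actual design.

# Hypothetical sketch: per-pixel contact estimation as image-to-image
# translation. Architecture, channel counts, and loss are assumptions.
import torch
import torch.nn as nn

class ContactNet(nn.Module):
    """Tiny encoder-decoder: RGB image -> single-channel contact map."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )
    def forward(self, x):
        return self.decoder(self.encoder(x))  # per-pixel contact logits

model = ContactNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.BCEWithLogitsLoss()

# One training step against dense contact annotations (dummy data here).
rgb = torch.rand(4, 3, 256, 256)                             # input frames
contact_gt = torch.randint(0, 2, (4, 1, 256, 256)).float()   # contact labels
loss = loss_fn(model(rgb), contact_gt)
opt.zero_grad(); loss.backward(); opt.step()

Because the output is a camera-space contact map rather than a tracker-specific quantity, such a module could in principle be attached to any two-hand tracking framework, as the abstract states.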
Description
Doctoral thesis defended at the Universidad Rey Juan Carlos de Madrid in 2024.
Directors:
Dan Casas Guix
Except where otherwise noted, the item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International.