Title: Modeling 3D Hand Interactions for Accurate Contact and Manipulation
Author: Sorli, Suzanne
Date issued: 2024 (deposited 2025-03-06)
URI: https://hdl.handle.net/10115/79017
Language: en
Type: Thesis (info:eu-repo/semantics/openAccess)
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Subject: Modeling 3D
Description: Doctoral thesis defended at Universidad Rey Juan Carlos de Madrid in 2024. Supervisors: Dan Casas Guix.

Abstract:

Over the past few years, Augmented Reality (AR) and Virtual Reality (VR) have gained popularity and continue to grow in interest and use, particularly around user interaction. These technologies transform how people engage with digital content and immersive environments, attracting considerable attention across sectors such as entertainment, education, training and healthcare. The need for natural interaction in VR and AR has emerged as a way to enhance immersion, accessibility and realism. The goal is to redefine human-computer interaction by seamlessly blending the virtual and physical worlds, offering levels of engagement that range from subtle virtual manipulation to full-body interaction within simulated environments.

AR and VR rely on two major technology components that are typically addressed independently: hand tracking, and interactions involving both hand-object and hand-to-hand scenarios. Existing methods often simplify these challenges, limiting their real-world impact. For hand-object interaction, the most general approach uses physics simulation so that hands and objects interact according to the laws of contact mechanics. However, differences in size and skeletal morphology between the hand representations used by simulators and by tracking devices complicate this process. The first contribution of this thesis is a personalized soft-hand model paired with a pose retargeting strategy, formulated as an optimization problem, that connects the tracked and simulated hand representations. This method integrates off-the-shelf hand tracking solutions with physics-based hand simulation without requiring a common hand representation, while still allowing parametrization of the hand model.
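The retargeting above is formulated in the thesis as an optimization problem connecting tracked and simulated hand representations. As a rough sketch of that idea only (every name, dimension and weight below is a hypothetical stand-in, not the thesis' implementation), one could search for simulated-hand joint angles whose forward kinematics best match the tracker's keypoints:

import numpy as np
from scipy.optimize import minimize

# Hypothetical dimensions: 20 joint angles for the simulated hand,
# 21 keypoints reported by the tracking device.
N_JOINTS = 20
N_KEYPOINTS = 21

def forward_kinematics(theta):
    """Placeholder for the simulated hand's skeleton: joint angles -> 3D keypoints.

    A real implementation would evaluate the personalized soft-hand model;
    a fixed random linear map is used here only so the sketch runs.
    """
    basis = np.random.default_rng(0).standard_normal((N_KEYPOINTS * 3, N_JOINTS))
    return (basis @ theta).reshape(N_KEYPOINTS, 3)

def retarget(tracked_keypoints, theta_prev):
    """Find joint angles whose keypoints best match the tracked hand."""
    def objective(theta):
        residual = forward_kinematics(theta) - tracked_keypoints
        # Data term (squared keypoint distance) plus a regularizer toward
        # the previous pose for temporal stability.
        return np.sum(residual ** 2) + 1e-3 * np.sum((theta - theta_prev) ** 2)
    return minimize(objective, theta_prev, method="L-BFGS-B").x

tracked = np.zeros((N_KEYPOINTS, 3))   # stand-in for one frame of tracker output
theta = retarget(tracked, np.zeros(N_JOINTS))

The actual method additionally handles the mismatch in size and skeletal morphology between the two representations; the sketch only conveys the structure of the optimization.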
Hand-object interaction requires tracking the hand in the real world so that our gestures can be mapped to a virtual scenario. Hand tracking is a growing research field with the potential to provide a natural interface for interacting with virtual environments. Common solutions use computer vision methods, often coupled with learning-based tracking algorithms that are either depth-based or RGB-based. These methods output the skeletal morphology and configuration of a hand that best matches the user's actual hand, and some also estimate the hand shape. Given the ubiquity of RGB cameras, research has shifted towards RGB-based methods. Despite recent advances, 3D tracking of two interacting hands from RGB images remains challenging due to inter-hand occlusion, depth ambiguity, handedness segmentation and collisions. Additionally, machine-learning-based approaches are difficult to train because sufficient high-quality training data is hard to obtain, which further complicates the development of robust hand tracking systems.

To address these challenges, we propose the first system that simulates physically correct two-hand interactions with personalized hand shapes and diverse appearances, and that generates precise synthetic data. This framework is a major component of a state-of-the-art algorithm for tracking two interacting hands from RGB images. Furthermore, we tackle the depth errors that prevent accurate hand-to-hand contact detection while tracking two interacting hands by developing an image-based, data-driven approach formulated as an image-to-image translation problem. To train our method, we introduce a new pipeline for automatically annotating dense surface contacts in hand interaction sequences. Consequently, our method estimates camera-space contacts during interactions and can be plugged into any two-hand tracking framework.
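The contact estimation above is framed as an image-to-image translation problem trained on the automatically annotated contact data. Purely as an illustration of that framing (the architecture, loss and tensor shapes below are assumptions, not the network described in the thesis), a per-pixel contact map could be regressed with a small encoder-decoder:

import torch
import torch.nn as nn

class ContactNet(nn.Module):
    """Map a two-hand image to a per-pixel contact probability map."""
    def __init__(self, in_channels=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return torch.sigmoid(self.decoder(self.encoder(x)))

# One training step on stand-in data: dense surface-contact annotations
# (here random) supervise a per-pixel binary cross-entropy loss.
model = ContactNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
images = torch.rand(4, 3, 128, 128)                        # fake RGB frames
contacts = torch.randint(0, 2, (4, 1, 128, 128)).float()   # fake contact maps
loss = nn.functional.binary_cross_entropy(model(images), contacts)
optimizer.zero_grad()
loss.backward()
optimizer.step()

An image-to-image formulation of this kind produces camera-space contact maps per frame, which matches the plug-in use with arbitrary two-hand tracking frameworks described above.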