Browsing by Author "Sorli, Suzanne"
Showing 1 - 3 of 3
Item: Accurate hand contact detection from RGB images via image-to-image translation (Elsevier, 2025-05)
Sorli, Suzanne; Comino-Trinidad, Marc; Casas, Dan

Hand tracking is a growing research field that can potentially provide a natural interface to interact with virtual environments. However, despite impressive recent advances, the 3D tracking of two interacting hands from RGB video remains an open problem. While current methods can infer the 3D pose of two interacting hands reasonably well, residual errors in depth, shape, and pose estimation prevent the accurate detection of hand-to-hand contact. To mitigate these errors, in this paper we propose an image-based, data-driven method to estimate contact in hand-to-hand interactions. Our method is built on top of 3D hand trackers that predict the articulated pose of two hands, enriching them with camera-space probability maps of contact points. To train our method, we first feed motion capture data of interacting hands into a physics-based hand simulator and compute dense 3D contact points. We then render such contact maps from various viewpoints and create a dataset of pairs of pixel-to-surface hand images and their corresponding contact labels. Finally, we train an image-to-image network that learns to translate pixel-to-surface correspondences to contact maps. At inference time, we estimate pixel-to-surface correspondences using state-of-the-art hand tracking and then use our network to predict accurate hand-to-hand contact. We qualitatively and quantitatively validate our method on real-world data and demonstrate that our contact predictions are more accurate than those of state-of-the-art hand-tracking methods.

Item: Fine Virtual Manipulation with Hands of Different Sizes (GMRV Publications, 2021)
Sorli, Suzanne; Verschoor, Mickeal; Casas, Dan; Tajadura-Jiménez, Ana; Otaduy, Miguel A.

Natural interaction with virtual objects relies on two major technology components: hand tracking and hand-object physics simulation. Functional solutions exist for both components, but their hand representations may differ in size and skeletal morphology, which makes connecting them non-trivial. In this paper, we introduce a pose retargeting strategy to connect the tracked and simulated hand representations, and we formulate and solve this retargeting as an optimization problem. We also carry out a user study demonstrating that our approach enables fine manipulations that are slow and awkward with naïve approaches.
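To make the retargeting-as-optimization idea in the item above concrete, here is a minimal sketch in Python: it retargets the pose of a tracked planar two-bone chain onto a simulated chain with different bone lengths by minimizing the distance between corresponding keypoints. The two-bone chain, the bone lengths, and all function names are hypothetical stand-ins for illustration, not the paper's actual hand model or solver.

    # Hypothetical sketch: pose retargeting between two skeletons of
    # different bone lengths, posed as keypoint-matching optimization.
    import numpy as np
    from scipy.optimize import minimize

    def fk(angles, bone_lengths):
        """Forward kinematics of a planar 2-bone chain: angles -> keypoints."""
        pts, p, theta = [], np.zeros(2), 0.0
        for a, l in zip(angles, bone_lengths):
            theta += a
            p = p + l * np.array([np.cos(theta), np.sin(theta)])
            pts.append(p)
        return np.stack(pts)

    # Tracked hand vs. simulated hand: same topology, different bone lengths.
    tracked_lengths = np.array([1.0, 0.8])
    sim_lengths = np.array([1.2, 0.7])

    tracked_angles = np.array([0.4, 0.6])  # pose reported by the tracker
    target_kps = fk(tracked_angles, tracked_lengths)

    # Retargeting: find simulated-hand angles whose keypoints best match
    # the tracked keypoints despite the skeletal mismatch.
    def objective(angles):
        return np.sum((fk(angles, sim_lengths) - target_kps) ** 2)

    res = minimize(objective, x0=tracked_angles, method="L-BFGS-B")
    print("retargeted angles:", res.x)

Initializing the solver with the tracker's own pose is a natural choice here, since the retargeted pose is expected to stay close to it; the real system optimizes over a full articulated hand rather than this toy chain.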
Item: Modeling 3D Hand Interactions for Accurate Contact and Manipulation (Universidad Rey Juan Carlos, 2024)
Sorli, Suzanne

Over the past few years, Augmented Reality (AR) and Virtual Reality (VR) have gained popularity and continue to grow in interest and use, with user interaction as a particular focus. These technologies transform how people engage with digital content and immersive environments, attracting considerable attention across sectors such as entertainment, education, training, and healthcare. The need for natural interaction in VR and AR has emerged to enhance immersion, accessibility, and realism. The goal is to redefine human-computer interaction by seamlessly blending the virtual and physical worlds, offering engagement levels that range from subtle virtual manipulation to full-body interaction within simulated environments.

AR and VR rely on two major technology components that are typically addressed independently: hand tracking and interaction, both hand-object and hand-to-hand. Existing methods often simplify these challenges, limiting their real-world impact. For hand-object interaction, the most general approach uses physics simulation so that hands and objects interact according to the laws of contact mechanics. However, differences in size and skeletal morphology between the hand representations of simulators and tracking devices complicate this process. The first contribution of this thesis is a personalized soft-hand model paired with a pose retargeting strategy, formulated as an optimization problem, that connects the tracked and simulated hand representations. This method integrates off-the-shelf hand tracking solutions with physics-based hand simulation without requiring a common hand representation, while still allowing the hand model to be parametrized.

Hand-object interaction requires tracking the hand in the real world to map our gestures into a virtual scenario. Hand tracking is a growing research field with the potential to provide a natural interface for interacting with virtual environments. Common solutions use computer vision methods, often coupled with learning-based tracking algorithms, which can be depth-based or RGB-based. These methods output the skeletal morphology and configuration of a hand that best matches the user's actual hand, with some also estimating the hand shape. Given the ubiquity of RGB cameras, research has shifted towards RGB-based methods. Despite recent advances, the 3D tracking of two interacting hands from RGB images remains challenging due to issues such as inter-hand occlusion, depth ambiguity, handedness segmentation, and collisions. Additionally, machine-learning-based approaches are difficult to train because sufficient high-quality training data is hard to obtain, further complicating the development of robust hand tracking systems.

To address these challenges, we propose the first system that simulates physically correct two-hand interactions with personalized hand shapes and diverse appearances, generating precise synthetic data. This framework is a major component of a state-of-the-art algorithm for tracking two interacting hands from RGB images. Furthermore, we tackle the depth errors that prevent accurate hand-to-hand contact detection when tracking two interacting hands by developing an image-based, data-driven approach formulated as an image-to-image translation problem. To train our method, we introduce a new pipeline for automatically annotating dense surface contacts in hand interaction sequences. As a result, our method estimates camera-space contacts during interactions and can be plugged into any two-hand tracking framework.
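The image-to-image translation formulation described in the thesis (and in the first item above) can be sketched as follows: a convolutional network maps pixel-to-surface correspondence images to per-pixel contact logits and is trained with a binary cross-entropy loss against rendered contact labels. The channel counts, the toy architecture, and the random stand-in data below are assumptions for illustration only, not the actual network or dataset.

    # Hypothetical sketch: pixel-to-surface correspondences -> contact map,
    # framed as image-to-image translation with a tiny conv net (PyTorch).
    import torch
    import torch.nn as nn

    class ContactNet(nn.Module):
        def __init__(self, in_ch=6, feat=32):
            # in_ch=6 assumes a 3-channel surface-coordinate image per hand.
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat, 1, 3, padding=1),  # per-pixel contact logits
            )

        def forward(self, x):
            return self.net(x)

    model = ContactNet()
    loss_fn = nn.BCEWithLogitsLoss()

    # One training step on random stand-in data: correspondence images
    # paired with (sparse) rendered contact labels.
    corr = torch.rand(4, 6, 128, 128)                    # pixel-to-surface images
    labels = (torch.rand(4, 1, 128, 128) > 0.9).float()  # contact labels
    loss = loss_fn(model(corr), labels)
    loss.backward()

    # At inference time, the sigmoid of the logits gives camera-space
    # contact probabilities for the tracked hands.
    probs = torch.sigmoid(model(corr))

In the described pipeline, the input correspondences would come from a state-of-the-art two-hand tracker and the labels from the physics-based contact annotation step, so the predicted probability maps can be attached to any two-hand tracking framework.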