This paper proposes using virtual reality to enhance the perception of actions by distant users on a shared application. Here, distance may refer either to space (e.g. in a remote synchronous collaboration) or to time (e.g. during playback of recorded actions). Our approach consists of immersing the application in a virtual inhabited 3D space and mimicking user actions by animating avatars whose motion can be generated either from user actions on the shared application or from motion capture by a computer vision system ...
... 1 Gerard Chollet, Anna Esposito, Annie Gentes, Patrick Horain, Walid Karam, Zhenbo Li, Catherine Pelachaud, Patrick Perrot, Dijana Petrovska-Delacretaz, Dianle Zhou, and Leila Zouari Speech through the Ear, the Eye, the Mouth and the Hand ...
Virtual worlds are developing rapidly over the Internet. They are visited by avatars and staffed with Embodied Conversational Agents (ECAs). An avatar is a representation of a physical person. Each person controls one or several avatars and usually receives feedback from the virtual world on an audio-visual display. Ideally, all senses should be used to feel fully embedded in a virtual world. Sound, vision and sometimes touch are the available modalities. This paper reviews the technological developments which enable audio-visual interactions in virtual and augmented reality worlds. Emphasis is placed on speech and gesture interfaces and on talking face analysis and synthesis.
2013 IEEE International Conference on Computer Vision, 2013
Elongated objects have various shapes and can shift, rotate, change scale, and be rigid or deform by flexing, articulating, and vibrating, with examples as varied as a glass bottle, a robotic arm, a surgical suture, a finger pair, a tram, and a guitar string. This generally makes tracking the poses of elongated objects very challenging.
Computer Vision – ECCV 2012. Workshops and Demonstrations, Oct 2012
Avatars in networked 3D virtual environments allow users to interact over the Internet and to get some feeling of virtual telepresence. However, avatar control may be tedious. Motion capture systems based on 3D sensors have recently reached the consumer market, but webcams and camera-phones are more widespread and cheaper. The proposed demonstration aims at animating a user's avatar from real-time 3D motion capture by monoscopic computer vision, thus allowing virtual telepresence to anyone using a personal computer with a webcam or a camera-phone. This kind of immersion allows new gesture-based communication channels to be opened in a virtual inhabited 3D space.
MIRAGE '09 Proceedings of the 4th International Conference on Computer Vision/Computer Graphics Collaboration Techniques, 2009
3D human motion capture by real-time monocular vision without using markers can be achieved by registering a 3D articulated model on a video. Registration consists of iteratively optimizing the match between primitives extracted from the model and the images with respect to the model position and joint angles. We extend a previous color-based registration algorithm with a more precise edge-based registration step. We present an experimental analysis of the residual error vs. the computation time and we discuss the balance between both approaches.
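The registration idea in the abstract above can be illustrated with a toy sketch. It is not the paper's color- or edge-based algorithm: it assumes a hypothetical planar two-link arm as the articulated model, uses two point primitives (elbow and hand) instead of image regions or edges, and refines only the joint angles by plain gradient descent with numerical gradients. All names (`model_points`, `match_cost`, `register`) and parameter values are illustrative assumptions.

```python
import math

def model_points(pose, lengths=(1.0, 0.8)):
    """Elbow and hand positions of a planar two-link arm (toy articulated model)."""
    t1, t2 = pose
    l1, l2 = lengths
    elbow = (l1 * math.cos(t1), l1 * math.sin(t1))
    hand = (elbow[0] + l2 * math.cos(t1 + t2),
            elbow[1] + l2 * math.sin(t1 + t2))
    return [elbow, hand]

def match_cost(pose, observed):
    """Sum of squared distances between model points and observed points."""
    return sum((mx - ox) ** 2 + (my - oy) ** 2
               for (mx, my), (ox, oy) in zip(model_points(pose), observed))

def register(observed, pose, step=0.1, eps=1e-6, iterations=500):
    """Iteratively refine the joint angles by gradient descent.

    The gradient is estimated by forward finite differences, so the
    result is only accurate to roughly the size of eps.
    """
    pose = list(pose)
    for _ in range(iterations):
        base = match_cost(pose, observed)
        grad = []
        for i in range(len(pose)):
            bumped = list(pose)
            bumped[i] += eps
            grad.append((match_cost(bumped, observed) - base) / eps)
        pose = [p - step * g for p, g in zip(pose, grad)]
    return pose

# Synthetic "observation": points generated from a known pose (assumed values).
true_pose = (0.6, -0.4)
observed = model_points(true_pose)
estimate = register(observed, pose=(0.3, 0.0))
```

Starting from a nearby initial guess, as a tracker would from the previous frame, the estimate converges toward the pose that generated the observation. A real system would replace `match_cost` with a measure computed from image primitives.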
ORASIS'09 - Congrès des jeunes chercheurs en vision par ordinateur, 2009
We address real-time, markerless 3D acquisition of human gestures by monoscopic vision. Our approach registers an articulated 3D body model on a video sequence by iteratively searching for the model position and joint angles that maximize the match between features of the projected 3D model and image primitives. We previously described a video-rate implementation of an initial registration on colored regions followed by a more precise registration on edges [7]. In this work, we experimentally compare the residual error as a function of computation time for each of these registration primitives, and we propose a trade-off depending on the available computing power.
Virtual worlds are developing rapidly over the Internet. They are visited by avatars and staffed with Embodied Conversational Agents (ECAs). An avatar is a representation of a physical person. Each person controls one or several avatars and usually receives feedback from the virtual world on an audio-visual display. Ideally, all senses should be used to feel fully embedded in a virtual world. Sound, vision and sometimes touch are the available modalities. This paper reviews the technological developments which enable audio-visual interactions in virtual and augmented reality worlds. Emphasis is placed on speech and gesture interfaces, including talking face analysis and synthesis.
Abstract: The invention relates to the field of image processing, and especially to a method for dematrixing an image formed by a stream of data acquired according to a specific pattern of colour components and comprising at least one over-sampled component organised according to at least one privileged direction.
Analyse de séquences non calibrées pour la reconstruction 3D de scène
Abstract: Our goal is to recover a synthesizable 3D representation of a dynamic scene from a sequence of uncalibrated images of that scene. The scene contains a few objects for which a generic deformable 3D model is available, placed in an arbitrary environment. Unknown static regions can be reconstructed with projective techniques. We present the dense matching stage that precedes reconstruction.
Abstract: The method involves rotating an oversampled component according to an angle corresponding to a privileged direction, and re-sampling under-sampled components on pixels of the oversampled component. Chromatic components correspond to Red-Green-Blue format, and the oversampled component is the green component. The re-sampling of the under-sampled components is an interpolation of the under-sampled components.
Markovian approach for 3D reconstruction of vessels from two views
Abstract: A method to reconstruct vessel lumens, based on constrained reconstruction of serial cross-sections from two digital angiographic projections, is proposed. Each cross-section is reconstructed as a binary matrix from its two densitometric data projections, with ambiguities in the reconstruction removed by a priori knowledge. A probabilistic approach in which properties of the expected solution are described through a Markov
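The ambiguity mentioned in the abstract above can be seen on a tiny example. The sketch below is only an illustration, not the paper's method: it assumes the two projections are plain row sums and column sums of a binary matrix, and the helper name `projections` is hypothetical.

```python
def projections(matrix):
    """Row sums and column sums of a binary matrix (two orthogonal views)."""
    rows = [sum(row) for row in matrix]
    cols = [sum(col) for col in zip(*matrix)]
    return rows, cols

# Two different binary cross-sections...
section_a = [[1, 0],
             [0, 1]]
section_b = [[0, 1],
             [1, 0]]

# ...that produce exactly the same pair of projections.
same_views = projections(section_a) == projections(section_b)
```

Because both matrices yield identical projections, two views alone cannot distinguish them, which is why additional prior knowledge about the expected shape is needed to select one solution.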
Particle filtering is known as a robust approach for motion tracking by vision, at the cost of heavy computation in the high-dimensional pose space. In this work, we describe a number of heuristics that we demonstrate to jointly improve robustness and real-time performance for motion capture. 3D human motion capture by monocular vision without markers can be achieved in real time by registering a 3D articulated model on a video. First, we search the high-dimensional space of 3D poses by generating new hypotheses (or particles) with equivalent 2D projection by kinematic flipping. Second, we use a semi-deterministic particle prediction based on local optimization. Third, we deterministically resample the probability distribution for a more efficient selection of particles. Particles (or poses) are evaluated using a match cost function and penalized with a Gaussian probability pose distribution learned off-line. To achieve real-time performance, the measurement step is parallelized on the GPU using the OpenCL API. We present experimental results demonstrating robust real-time 3D motion capture with a consumer computer and webcam.
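As a rough illustration of the third heuristic above, the sketch below resamples particles deterministically. The abstract does not specify the scheme, so this is an assumption: it uses evenly spaced positions on the cumulative weight distribution (systematic resampling with a fixed offset instead of a random one). The function name, particle labels, and weights are hypothetical.

```python
def deterministic_resample(particles, weights):
    """Select len(particles) particles at evenly spaced cumulative-weight positions.

    No random numbers are drawn, so the same weights always give the
    same selection. Weights need not be normalized.
    """
    n = len(particles)
    total = sum(weights)
    resampled = []
    j = 0
    cumulative = weights[0] / total
    for i in range(n):
        position = (i + 0.5) / n  # fixed offset instead of a random draw
        while position > cumulative and j < n - 1:
            j += 1
            cumulative += weights[j] / total
        resampled.append(particles[j])
    return resampled

# Hypothetical pose hypotheses with their normalized weights.
particles = ["pose_a", "pose_b", "pose_c", "pose_d"]
weights = [0.1, 0.6, 0.2, 0.1]
resampled = deterministic_resample(particles, weights)
```

With these weights, the heavily weighted `pose_b` is duplicated while the two lightest hypotheses are dropped, so computation concentrates on the most plausible poses without sampling noise.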
Papers by Patrick Horain