Robust Discriminative Analysis Framework for Gaze and Headpose Estimation

2021 ◽  
Author(s):  
Salahaldeen Rabba

Head movements, combined with gaze, play a fundamental role in predicting a person’s actions and intentions. In non-constrained head movement settings, the estimation process is complex, and performance can degrade significantly under variations in head-pose, gaze position, occlusion and ambient illumination. This thesis therefore proposes a framework that fuses head-pose and gaze information to obtain more robust and accurate gaze estimation. Specific contributions include: a graph-based model for pupil localization and accurate estimation of the pupil center; a novel iris region descriptor based on quadtree decomposition that works together with pupil localization for gaze estimation; kernel-based extensions and enhancements to a fusion mechanism known as Discriminative Multiple Canonical Correlation Analysis (DMCCA), which fuses proposed and traditional features into a refined, high-quality feature set for classification; and a methodology for head-pose features based on quadtree decompositions and geometrical moments that better integrates roll, yaw, pitch and jawline into the overall estimation framework. Experimental results demonstrate that the proposed framework is robust against variations in illumination, occlusion and head-pose, and is calibration free. The framework was validated on several datasets, achieving angular errors of 4.5° on MPII, 4.4° on Cave, 4.8° on EYEDIAP, 5.0° on ACS, 4.1° on OSLO and 4.5° on UULM.
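
As a rough illustration of the quadtree idea behind the proposed iris region descriptor (a sketch only, not the thesis implementation; the function names, split threshold and 64-element padding are assumptions), the snippet below recursively splits a grayscale iris patch into quadrants until each block is nearly homogeneous and summarizes the leaves as a fixed-length feature vector.

```python
import numpy as np

def quadtree_decompose(patch, threshold=12.0, min_size=4):
    """Recursively split a grayscale patch into quadrants until each
    block's intensity standard deviation falls below `threshold`.
    Returns a list of (x, y, width, height) leaf blocks."""
    blocks = []

    def split(x, y, w, h):
        block = patch[y:y + h, x:x + w]
        if w <= min_size or h <= min_size or block.std() <= threshold:
            blocks.append((x, y, w, h))
            return
        hw, hh = w // 2, h // 2
        split(x, y, hw, hh)
        split(x + hw, y, w - hw, hh)
        split(x, y + hh, hw, h - hh)
        split(x + hw, y + hh, w - hw, h - hh)

    split(0, 0, patch.shape[1], patch.shape[0])
    return blocks

def iris_descriptor(patch, threshold=12.0):
    """Toy descriptor: mean intensity and relative size of each leaf block,
    padded/truncated to a fixed length so patches are comparable."""
    feats = []
    for (x, y, w, h) in quadtree_decompose(patch, threshold):
        block = patch[y:y + h, x:x + w]
        feats.extend([block.mean() / 255.0, (w * h) / patch.size])
    feats = feats[:64] + [0.0] * max(0, 64 - len(feats))
    return np.asarray(feats, dtype=np.float32)

# Example: descriptor for a synthetic 32x32 iris-region patch.
rng = np.random.default_rng(0)
patch = (rng.random((32, 32)) * 255).astype(np.float32)
print(iris_descriptor(patch).shape)  # (64,)
```

The appeal of such a decomposition is that block sizes adapt to local texture: homogeneous iris regions contribute a few coarse features, while detailed regions contribute many fine ones.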


2020 ◽  
Vol 14 (01) ◽  
pp. 107-135
Author(s):  
Salah Rabba ◽  
Matthew Kyan ◽  
Lei Gao ◽  
Azhar Quddus ◽  
Ali Shahidi Zandi ◽  
...  

There remain outstanding challenges in improving the accuracy of multi-feature information for head-pose and gaze estimation. The proposed framework employs discriminative analysis for head-pose and gaze estimation using kernel discriminative multiple canonical correlation analysis (K-DMCCA). The feature extraction component of the framework includes spatial indexing, statistical and geometrical elements. Head-pose and gaze estimates are obtained by aggregating these features and transforming them into a higher-dimensional space using K-DMCCA. The two main contributions are: enhancing fusion performance through the use of kernel-based DMCCA, and introducing an improved iris region descriptor based on quadtree decomposition. The overall approach also includes statistical and geometrical indexing and is calibration free (it does not require any subsequent adjustment). We validate the robustness of the proposed framework across a wide variety of datasets, which span different modalities (RGB and depth), constraints (a wide range of head-poses, not only frontal), quality (accurately labelled for validation), occlusion (due to glasses, hair bangs, facial hair) and illumination. Our method achieved head-pose and gaze estimation errors of 4.8° using Cave, 4.6° using MPII, 5.1° using ACS, 5.9° using EYEDIAP, 4.3° using OSLO and 4.6° using UULM datasets.
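
The exact K-DMCCA formulation is not reproduced in the abstract; as a simplified stand-in that only conveys the flavor of kernel-based feature fusion, the sketch below fuses two feature views with a plain regularized kernel CCA (RBF kernels, generalized eigenvalue solution) and concatenates the canonical projections into a single feature matrix. The function names, kernel choice and regularization values are assumptions.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def rbf_kernel(A, B, gamma=0.1):
    """RBF Gram matrix between row-wise samples of A and B."""
    return np.exp(-gamma * cdist(A, B, "sqeuclidean"))

def center_kernel(K):
    """Double-center a Gram matrix (feature-space mean removal)."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_cca_fuse(X, Y, n_components=2, gamma=0.1, reg=1e-3):
    """Fuse two feature views with regularized kernel CCA and return the
    concatenated canonical projections as one feature matrix."""
    n = X.shape[0]
    Kx = center_kernel(rbf_kernel(X, X, gamma))
    Ky = center_kernel(rbf_kernel(Y, Y, gamma))
    zero = np.zeros((n, n))
    A = np.block([[zero, Kx @ Ky], [Ky @ Kx, zero]])
    B = np.block([[Kx @ Kx + reg * np.eye(n), zero],
                  [zero, Ky @ Ky + reg * np.eye(n)]])
    vals, vecs = eigh(A, B)                  # ascending eigenvalues
    top = vecs[:, ::-1][:, :n_components]    # directions with largest correlation
    alpha, beta = top[:n], top[n:]
    return np.hstack([Kx @ alpha, Ky @ beta])

# Example: fuse toy "iris descriptor" and "head-pose" views for 50 samples.
rng = np.random.default_rng(1)
iris_feats = rng.standard_normal((50, 64))
pose_feats = rng.standard_normal((50, 6))
fused = kernel_cca_fuse(iris_feats, pose_feats, n_components=4)
print(fused.shape)  # (50, 8)
```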


Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 26
Author(s):  
David González-Ortega ◽  
Francisco Javier Díaz-Pernas ◽  
Mario Martínez-Zarzuela ◽  
Míriam Antón-Rodríguez

A driver’s gaze information can be crucial in driving research because of its relation to driver attention. In particular, the inclusion of gaze data in driving simulators broadens the scope of research studies, as drivers’ gaze patterns can be related to their characteristics and performance. In this paper, we present two gaze region estimation modules integrated into a driving simulator: one uses the 3D Kinect device and the other uses the virtual reality Oculus Rift device. The modules detect, in every processed frame of a route, which of the seven regions into which the driving scene was divided the driver is gazing at. Four gaze estimation methods that learn the relation between gaze displacement and head movement were implemented and compared: two simpler, point-based methods that try to capture this relation directly, and two based on classifiers such as MLP and SVM. Experiments were carried out with 12 users who drove the same scenario twice, each time with a different visualization display, first with a big screen and later with the Oculus Rift. On the whole, the Oculus Rift outperformed the Kinect as hardware for gaze estimation. The best-performing Oculus-based gaze region estimation method achieved an accuracy of 97.94%. The information provided by the Oculus Rift module enriches the driving simulator data and makes a multimodal driving performance analysis possible, in addition to the immersion and realism of the virtual reality experience provided by the Oculus.
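
As a loose illustration of the classifier-based variant (an SVM mapping head-movement features to one of the seven gaze regions), here is a minimal scikit-learn sketch on synthetic data; the feature layout, labels and hyperparameters are assumptions rather than the paper's configuration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical per-frame features: head yaw/pitch/roll plus a 2-D gaze
# displacement relative to a neutral pose; label = gaze region (0-6).
rng = np.random.default_rng(42)
X = rng.standard_normal((2000, 5))
y = rng.integers(0, 7, size=2000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
clf.fit(X_train, y_train)
print(f"Gaze-region accuracy: {clf.score(X_test, y_test):.3f}")
```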


2012 ◽  
Vol 21 (2) ◽  
pp. 802-815 ◽  
Author(s):  
R. Valenti ◽  
N. Sebe ◽  
T. Gevers

2007 ◽  
Vol 97 (2) ◽  
pp. 1149-1162 ◽  
Author(s):  
Mario Prsa ◽  
Henrietta L. Galiana

Models of combined eye-head gaze shifts all aim to realistically simulate behaviorally observed movement dynamics. One of the most problematic features of such models is their inability to determine when a saccadic gaze shift should be initiated and when it should be ended. This is commonly referred to as the switching mechanism mediated by omni-directional pause neurons (OPNs) in the brain stem. Proposed switching strategies implemented in existing gaze control models all rely on a sensory error between instantaneous gaze position and the spatial target. Accordingly, gaze saccades are initiated after presentation of an eccentric visual target and subsequently terminated when an internal estimate of gaze position becomes nearly equal to that of the target. Based on behavioral observations, we demonstrate that such a switching mechanism is insufficient and is unable to explain certain types of movements. We propose an improved hypothesis for how the OPNs control gaze shifts based on a visual-vestibular interaction of signals known to be carried on anatomical projections to the OPN area. The approach is justified by the analysis of recorded gaze shifts interrupted by a head brake in animal subjects and is demonstrated by implementing the switching mechanism in an anatomically based gaze control model. Simulated performance reveals that a weighted sum of three signals: gaze motor error, head velocity, and eye velocity, hypothesized as inputs to OPNs, successfully reproduces diverse behaviorally observed eye-head movements that no other existing model can account for.
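
To make the hypothesized switching rule concrete, the toy sketch below thresholds a weighted sum of the three proposed OPN inputs (gaze motor error, head velocity, eye velocity); the weights, threshold and simulated signal shapes are arbitrary illustrations, not values from the study.

```python
import numpy as np

def opn_active(gaze_motor_error, head_velocity, eye_velocity,
               w=(1.0, 0.5, 0.5), threshold=2.0):
    """Toy switching rule: OPNs remain active (saccadic gaze shift suppressed)
    while a weighted sum of the three hypothesized inputs stays below a
    threshold; a gaze shift is in progress otherwise."""
    drive = (w[0] * abs(gaze_motor_error)
             + w[1] * abs(head_velocity)
             + w[2] * abs(eye_velocity))
    return drive < threshold

# Example: sample the rule along a crudely simulated gaze shift.
t = np.linspace(0.0, 0.3, 7)                         # seconds
gaze_error = 20.0 * np.exp(-t / 0.08)                # deg, decaying toward target
head_vel = 80.0 * np.sin(np.pi * t / 0.3)            # deg/s
eye_vel = 300.0 * np.exp(-((t - 0.05) / 0.04) ** 2)  # deg/s burst
for k in range(len(t)):
    print(f"t={t[k]:.2f}s  OPN active: "
          f"{opn_active(gaze_error[k], head_vel[k], eye_vel[k])}")
```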


Vision ◽  
2018 ◽  
Vol 2 (3) ◽  
pp. 35 ◽  
Author(s):  
Braiden Brousseau ◽  
Jonathan Rose ◽  
Moshe Eizenman

The most accurate remote Point of Gaze (PoG) estimation methods that allow free head movements use infrared light sources and cameras together with gaze estimation models. Current gaze estimation models were developed for desktop eye-tracking systems and assume that the relative roll between the system and the subject’s eyes (the ‘R-Roll’) is roughly constant during use. This assumption does not hold for hand-held mobile-device-based eye-tracking systems. We present an analysis showing that the accuracy of estimating the PoG on screens of hand-held mobile devices depends on the magnitude of the R-Roll angle and on the angular offset between the visual and optical axes of the individual viewer. We also describe a new method to determine the PoG that compensates for the effects of R-Roll on PoG accuracy. Experimental results on a prototype infrared smartphone show that for an R-Roll angle of 90°, the new method achieves an accuracy of approximately 1°, while a gaze estimation method that assumes a constant R-Roll angle achieves an accuracy of 3.5°. The manner in which the experimental PoG estimation errors increase with the R-Roll angle is consistent with the analysis. The method presented in this paper can significantly improve the performance of eye-tracking systems on hand-held mobile devices.
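
A geometric caricature of the compensation idea (not the paper's model-based method): because the offset between the visual and optical axes is fixed in the eye, its on-screen direction rotates with the R-Roll angle, so a per-subject offset calibrated at zero roll can be rotated by the measured roll before being applied. All names and quantities below are hypothetical.

```python
import numpy as np

def compensate_r_roll(optical_pog, kappa_offset, r_roll_deg):
    """Rotate the subject-specific visual/optical-axis offset (expressed here
    as an on-screen offset in mm) by the measured R-Roll angle, then add it
    to the optical-axis point of gaze."""
    theta = np.deg2rad(r_roll_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    return np.asarray(optical_pog) + rot @ np.asarray(kappa_offset)

# Example: the same subject offset applied without roll and with a 90-degree roll.
optical_pog = np.array([12.0, 30.0])   # mm on the device screen
kappa_offset = np.array([5.0, -1.5])   # calibrated at R-Roll = 0
print(compensate_r_roll(optical_pog, kappa_offset, 0.0))
print(compensate_r_roll(optical_pog, kappa_offset, 90.0))
```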


2018 ◽  
Vol 9 (1) ◽  
pp. 6-18 ◽  
Author(s):  
Dario Cazzato ◽  
Fabio Dominio ◽  
Roberto Manduchi ◽  
Silvia M. Castro

Automatic gaze estimation that does not rely on commercial, expensive eye-tracking hardware can enable several applications in the fields of human-computer interaction (HCI) and human behavior analysis. It is therefore not surprising that several related techniques and methods have been investigated in recent years. However, very few camera-based systems proposed in the literature are both real-time and robust. In this work, we propose a real-time gaze estimation system that needs no person-dependent calibration, can deal with illumination changes and head-pose variations, and works over a wide range of distances from the camera. Our solution is based on a 3-D appearance-based method that processes images from a built-in laptop camera. Real-time performance is obtained by combining head-pose information with geometrical eye features to train a machine learning algorithm. Our method has been validated on a data set of images of users in natural environments and shows promising results. The possibility of a real-time implementation, combined with the good quality of gaze tracking, makes this system suitable for various HCI applications.
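
As an illustrative sketch of the general recipe described here (head-pose information concatenated with geometrical eye features and fed to a learning algorithm), the snippet below trains a regressor on synthetic data; the feature set, model choice and numbers are assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Hypothetical training table: head pose (yaw, pitch, roll) concatenated with
# simple geometric eye features (normalized pupil-center offsets for both
# eyes); target = gaze direction (yaw, pitch) in degrees.
rng = np.random.default_rng(7)
head_pose = rng.uniform(-30, 30, size=(3000, 3))
eye_geom = rng.uniform(-1, 1, size=(3000, 4))
X = np.hstack([head_pose, eye_geom])
y = (0.8 * head_pose[:, :2] + 15.0 * eye_geom[:, :2]
     + rng.normal(0, 1.0, size=(3000, 2)))      # synthetic gaze angles

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
err = mean_absolute_error(y_te, reg.predict(X_te))
print(f"Mean absolute gaze error: {err:.2f} deg (synthetic data)")
```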


2018 ◽  
Vol 10 (5) ◽  
Author(s):  
Almoctar Hassoumi ◽  
Vsevolod Peysakhovich ◽  
Christophe Hurter

In this paper, we investigate how visualization assets can support the qualitative evaluation of gaze estimation uncertainty. Although eye tracking data are commonly available, little has been done to visually investigate the uncertainty of recorded gaze information. This paper aims to fill this gap through innovative uncertainty computation and visualization. Given a gaze processing pipeline, we estimate the location of the gaze position in the world camera view. To do so, we developed our own gaze data processing, which gives us access to every stage of the data transformation and thus to the uncertainty computation. To validate our gaze estimation pipeline, we designed an experiment with 12 participants and showed that the proposed correction methods reduced the error by about 1.32 cm, aggregating all 12 participants’ results. The Mean Angular Error is 0.25° (SD = 0.15°) after correction of the estimated gaze. Finally, to support the qualitative assessment of this data, we provide a map that encodes the actual uncertainty in the user’s point of view.
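
For reference, a Mean Angular Error of the kind reported here is typically computed as the angle between estimated and ground-truth gaze vectors; the minimal sketch below shows one common way to do this (not the authors' code).

```python
import numpy as np

def angular_error_deg(gaze_est, gaze_true):
    """Angle in degrees between estimated and ground-truth 3-D gaze vectors."""
    gaze_est = np.asarray(gaze_est, dtype=float)
    gaze_true = np.asarray(gaze_true, dtype=float)
    cos = np.sum(gaze_est * gaze_true, axis=-1) / (
        np.linalg.norm(gaze_est, axis=-1) * np.linalg.norm(gaze_true, axis=-1))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Example: mean angular error over a handful of synthetic gaze samples.
rng = np.random.default_rng(3)
true_dirs = rng.standard_normal((5, 3))
noisy_dirs = true_dirs + 0.01 * rng.standard_normal((5, 3))
print(f"Mean Angular Error: {angular_error_deg(noisy_dirs, true_dirs).mean():.3f} deg")
```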

