
Rendering of Eyes for Eye-Shape Registration and Gaze Estimation

¹University of Cambridge, United Kingdom ²Max Planck Institute for Informatics

International Conference on Computer Vision 2015

We render photorealistic images of eyes for use as training data. We prepare our dynamic eye region models by retopologizing high-quality 3D head scans (left) and annotating them with landmark and gaze information (green).

Abstract

Images of the eye are key in several computer vision problems, such as shape registration and gaze estimation. Recent large-scale supervised methods for these problems require time-consuming data collection and manual annotation, which can be unreliable. We propose synthesizing perfectly labelled photo-realistic training data in a fraction of the time. We used computer graphics techniques to build a collection of dynamic eye-region models from head scan geometry. These were randomly posed to synthesize close-up eye images for a wide range of head poses, gaze directions, and illumination conditions. Finally, we demonstrate the benefits of our synthesized training data (SynthesEyes) by out-performing state-of-the-art methods for eye-shape registration as well as cross-dataset appearance-based gaze estimation in the wild.

Downloads

Paper (.pdf, 9.85 MB)
SynthesEyes Dataset (.zip, see description below, 291 MB)

Dataset

The dataset contains 11,382 synthesized close-up images of eyes. There are ten directories, one for each dynamic eye region model in our collection. Each eye image has associated data stored in a pickle file. The directory structure for the dataset is as follows:

SynthesEyes_dataset
├── f01                              # data for f01 eye region model
│   ├── f01_36_0.1963_-0.7854.png    # 120x80px image
│   ├── f01_36_0.1963_-0.7854.pkl    # associated data for that image
│   └── …
├── f02
…                                    # data for f03, f04 … m03, m04
└── m05
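As a rough illustration of working with this layout, the sketch below pairs each image with its data file by swapping the file extension. The function name `list_samples` is hypothetical and not part of the dataset; the code assumes only the directory structure shown above.

```python
from pathlib import Path


def list_samples(dataset_root):
    """Return sorted (image_path, data_path) pairs, one per eye image.

    Hypothetical helper: assumes each model directory (f01 ... m05) holds
    .png images with a same-named .pkl file next to them.
    """
    pairs = []
    for png in sorted(Path(dataset_root).glob("*/*.png")):
        # with_suffix replaces only the final ".png", so the dots inside
        # names such as "f01_36_0.1963_-0.7854" are left untouched.
        pkl = png.with_suffix(".pkl")
        if pkl.exists():
            pairs.append((png, pkl))
    return pairs
```

Calling `list_samples("SynthesEyes_dataset")` on the unpacked archive should yield one pair per image; images without a matching data file are skipped rather than raising an error.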

The associated data for each image is a dict with keys:

look_vec – the 3D gaze direction in camera space.
head_pose – a 3x3 rotation matrix from world space to camera space.
ldmks – a dict containing the following 2D and 3D landmarks:

ldmks_lids_2d, ldmks_iris_2d, ldmks_pupil_2d in screen space.
ldmks_lids_3d, ldmks_iris_3d, ldmks_pupil_3d in camera space.
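A minimal sketch of reading one data file and inspecting the keys listed above. The helper names `load_eye_data` and `landmark_counts` are hypothetical. The `encoding="latin1"` argument is an assumption that lets Python 3 read pickles written by Python 2, in case the files were created that way. If the stored values are NumPy arrays, NumPy must be installed for unpickling to succeed.

```python
import pickle


def load_eye_data(pkl_path):
    """Load the dict stored alongside one eye image (hypothetical helper)."""
    with open(pkl_path, "rb") as f:
        # Assumption: latin1 decoding keeps Python 2 era pickles readable;
        # it is harmless for pickles written by Python 3.
        return pickle.load(f, encoding="latin1")


def landmark_counts(data):
    """Map each landmark set name (e.g. 'ldmks_iris_2d') to its point count."""
    return {name: len(points) for name, points in data["ldmks"].items()}
```

For example, `data = load_eye_data("SynthesEyes_dataset/f01/f01_36_0.1963_-0.7854.pkl")` gives access to `data["look_vec"]`, `data["head_pose"]`, and `data["ldmks"]`. Since unpickling can execute arbitrary code, only load files obtained from a trusted source.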

Bibtex

@inproceedings{wood2015_iccv,
  title     = {Rendering of Eyes for Eye-Shape Registration and Gaze Estimation},
  author    = {Erroll Wood and Tadas Baltrusaitis and Xucong Zhang and Yusuke Sugano and Peter Robinson and Andreas Bulling},
  year      = {2015},
  date      = {2015-12-12},
  booktitle = {Proc. of the IEEE International Conference on Computer Vision (ICCV 2015)}
}