Department of Computer Science and Technology

Technical reports

Mind-reading machines: automated inference of complex mental states

Rana Ayman el Kaliouby

July 2005, 185 pages

This technical report is based on a dissertation submitted March 2005 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Newnham College.

DOI: 10.48456/tr-636

Abstract

People express their mental states all the time, even when interacting with machines. These mental states shape the decisions that we make, govern how we communicate with others, and affect our performance. The ability to attribute mental states to others from their behaviour, and to use that knowledge to guide one's own actions and predict those of others, is known as theory of mind or mind-reading.

The principal contribution of this dissertation is the real time inference of a wide range of mental states from head and facial displays in a video stream. In particular, the focus is on the inference of complex mental states: the affective and cognitive states of mind that are not part of the set of basic emotions. The automated mental state inference system is inspired by and draws on the fundamental role of mind-reading in communication and decision-making.

The dissertation describes the design, implementation and validation of a computational model of mind-reading. The design is based on the results of a number of experiments that I have undertaken to analyse the facial signals and dynamics of complex mental states. The resulting model is a multi-level probabilistic graphical model that represents the facial events in a raw video stream at different levels of spatial and temporal abstraction. Dynamic Bayesian Networks model observable head and facial displays, and corresponding hidden mental states over time.
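
To make the role of the Dynamic Bayesian Networks concrete, the sketch below shows one forward-filtering step over a hidden mental-state variable, with binary head and facial "display" observations. It is a minimal illustration only: the state set, display set, and all probabilities are hypothetical placeholders, not the multi-level model learned in the dissertation.

# Minimal sketch of temporal filtering in a two-slice Dynamic Bayesian
# Network: a hidden mental state evolves over time and emits observable
# head/facial displays. All names and numbers below are illustrative.

STATES = ["agreeing", "disagreeing", "thinking"]      # hypothetical subset of classes
DISPLAYS = ["head_nod", "head_shake", "lip_pull"]     # hypothetical display detectors

# P(state_t | state_{t-1}): rows are the previous state, columns the next state.
TRANSITION = {
    "agreeing":    {"agreeing": 0.8, "disagreeing": 0.1, "thinking": 0.1},
    "disagreeing": {"agreeing": 0.1, "disagreeing": 0.8, "thinking": 0.1},
    "thinking":    {"agreeing": 0.2, "disagreeing": 0.2, "thinking": 0.6},
}

# P(display observed | state): one independent Bernoulli per display.
EMISSION = {
    "agreeing":    {"head_nod": 0.9,  "head_shake": 0.05, "lip_pull": 0.4},
    "disagreeing": {"head_nod": 0.05, "head_shake": 0.9,  "lip_pull": 0.2},
    "thinking":    {"head_nod": 0.2,  "head_shake": 0.1,  "lip_pull": 0.1},
}

def filter_step(prior, observed):
    """One forward-filtering step: propagate the belief through the
    transition model, weight by the likelihood of the observed displays,
    and renormalise."""
    predicted = {
        s: sum(prior[p] * TRANSITION[p][s] for p in STATES) for s in STATES
    }
    posterior = {}
    for s in STATES:
        likelihood = 1.0
        for d in DISPLAYS:
            p = EMISSION[s][d]
            likelihood *= p if d in observed else (1.0 - p)
        posterior[s] = predicted[s] * likelihood
    total = sum(posterior.values())
    return {s: v / total for s, v in posterior.items()}

# Usage: start from a uniform belief and update it with each video segment.
belief = {s: 1.0 / len(STATES) for s in STATES}
for frame_displays in [{"head_nod"}, {"head_nod", "lip_pull"}, set()]:
    belief = filter_step(belief, frame_displays)
    print(max(belief, key=belief.get), belief)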

The automated mind-reading system implements the model by combining top-down predictions of mental state models with bottom-up vision-based processing of the face. To support intelligent human-computer interaction, the system meets three important criteria. These are: full automation so that no manual preprocessing or segmentation is required, real time execution, and the categorization of mental states early enough after their onset to ensure that the resulting knowledge is current and useful.
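
One way to picture how these criteria could fit together is sketched below, building on the filtering step above: bottom-up vision processing extracts displays from each incoming frame, the top-down temporal model updates the belief over mental states, and a label is committed as soon as one state's posterior is confident enough, so a decision is available shortly after onset. The threshold, window length, and the detect_displays helper are assumptions for illustration, not the system's actual interface.

CONFIDENCE = 0.8   # hypothetical decision threshold
MAX_FRAMES = 30    # hypothetical bound so a decision is made soon after onset

def classify_early(frames, detect_displays):
    """Return (label, frame_index) as soon as the belief over mental
    states becomes confident; otherwise return the best guess after the
    window runs out."""
    belief = {s: 1.0 / len(STATES) for s in STATES}
    label = max(belief, key=belief.get)
    for i, frame in enumerate(frames[:MAX_FRAMES]):
        observed = detect_displays(frame)       # bottom-up: vision-based display detection
        belief = filter_step(belief, observed)  # top-down: temporal mental-state model
        label = max(belief, key=belief.get)
        if belief[label] >= CONFIDENCE:
            return label, i                     # commit early, soon after onset
    return label, None                          # no confident decision within the window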

The system is evaluated in terms of recognition accuracy, generalization and real time performance for six broad classes of complex mental states (agreeing, concentrating, disagreeing, interested, thinking and unsure) on two different corpora. The system successfully classifies and generalizes to new examples of these classes with an accuracy and speed comparable to those of human recognition.

The research I present here significantly advances the nascent ability of machines to infer cognitive-affective mental states in real time from nonverbal expressions of people. By developing a real time system for the inference of a wide range of mental states beyond the basic emotions, I have widened the scope of human-computer interaction scenarios in which this technology can be integrated. This is an important step towards building socially and emotionally intelligent machines.

Full text

PDF (24.3 MB)

BibTeX record

@TechReport{UCAM-CL-TR-636,
  author =	 {el Kaliouby, Rana Ayman},
  title = 	 {{Mind-reading machines: automated inference of complex
         	   mental states}},
  year = 	 2005,
  month = 	 jul,
  url = 	 {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-636.pdf},
  institution =  {University of Cambridge, Computer Laboratory},
  doi = 	 {10.48456/tr-636},
  number = 	 {UCAM-CL-TR-636}
}