Silent screen star Florence Lawrence displaying a range of emotions
People express their mental states, including emotions, thoughts, and desires, all the time through facial expressions, vocal nuances and gestures. This is true even when they are interacting with machines. Our mental states shape the decisions that we make, govern how we communicate with others, and affect our performance. The ability to attribute mental states to others from their behaviour, and to use that knowledge to guide our own actions and predict those of others, is known as theory of mind or mind-reading. It has recently gained attention with the growing number of people with Autism Spectrum Conditions, who have difficulty mind-reading.
Existing human-computer interfaces are mind-blind: they are oblivious to the user’s mental states and intentions. A computer may wait indefinitely for input from a user who is no longer there, or decide to do irrelevant tasks while a user is frantically working towards an imminent deadline. As a result, existing computer technologies often frustrate the user, have little persuasive power and cannot initiate interactions with the user. Even when they do take the initiative, like the now-retired Microsoft Paperclip, their interventions are often misguided and irrelevant, and add to the user’s frustration. With the increasing complexity of computer technologies and the ubiquity of mobile and wearable devices, there is a need for machines that are aware of the user’s mental state and that respond adaptively to it.
A computational model of mind-reading
Processing stages in the mind-reading system
Drawing inspiration from psychology, computer vision and machine learning, our team in the Computer Laboratory at the University of Cambridge has developed mind-reading machines: computers that implement a computational model of mind-reading to infer the mental states of people from their facial signals. The goal is to enhance human-computer interaction through empathic responses, to improve the productivity of the user and to enable applications to initiate interactions with and on behalf of the user, without waiting for explicit input from that user. Automated mind-reading poses several difficult challenges:
- It involves uncertainty, since a person’s mental state can only be inferred indirectly by analyzing the behaviour of that person. Even people are not perfect at reading the minds of others.
- Automatic analysis of the face from video is still an area of active research in its own right.
- There is no ‘code-book’ to interpret facial expressions as corresponding mental states.
Using a digital video camera, the mind-reading computer system analyzes a person’s facial expressions in real time and infers that person’s underlying mental state, such as whether he or she is agreeing or disagreeing, interested or bored, thinking or confused. The system is informed by the latest developments in the theory of mind-reading by Professor Simon Baron-Cohen, who leads the Autism Research Centre at Cambridge.
Prior knowledge of how particular mental states are expressed in the face is combined with analysis of facial expressions and head gestures occurring in real time. The model represents these at different levels of granularity, starting with face and head movements and combining them over time and space to build a clearer picture of which mental state is being expressed. Software from Nevenvision identifies 24 feature points on the face and tracks them in real time. Movement, shape and colour are then analyzed to identify gestures such as a smile or raised eyebrows. Combinations of these gestures occurring over time indicate mental states. For example, a head nod combined with a smile and raised eyebrows might mean interest. The relationship between observable head and facial displays and the corresponding hidden mental states over time is modelled using Dynamic Bayesian Networks.
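To make the last step more concrete, here is a minimal sketch of how a hidden mental state might be tracked from observed displays using the simplest kind of Dynamic Bayesian Network, a two-state hidden Markov model updated by forward filtering. It is an illustration only, not the system’s actual model: the state names, the display names and every probability below are assumptions chosen for the example.

```python
# Illustrative only: states, displays and all probabilities are assumed values,
# not parameters of the mind-reading system described in this article.
STATES = ["interested", "bored"]

# P(state at time t | state at time t-1): mental states tend to persist.
TRANSITION = {
    "interested": {"interested": 0.8, "bored": 0.2},
    "bored": {"interested": 0.2, "bored": 0.8},
}

# P(display is present | state); displays are treated as independent given the state.
DISPLAY_PROB = {
    "interested": {"head_nod": 0.6, "smile": 0.5, "eyebrow_raise": 0.4},
    "bored": {"head_nod": 0.1, "smile": 0.2, "eyebrow_raise": 0.1},
}


def likelihood(state, observed):
    """Probability of seeing exactly this set of displays in the given state."""
    p = 1.0
    for display, p_present in DISPLAY_PROB[state].items():
        p *= p_present if display in observed else 1.0 - p_present
    return p


def filter_step(belief, observed):
    """One forward-filtering update: predict with the transition model, then
    weight each state by how well it explains the observed displays."""
    predicted = {
        s: sum(belief[prev] * TRANSITION[prev][s] for prev in STATES)
        for s in STATES
    }
    unnormalised = {s: predicted[s] * likelihood(s, observed) for s in STATES}
    total = sum(unnormalised.values())
    return {s: v / total for s, v in unnormalised.items()}


def infer(observations, prior=None):
    """Return the belief over mental states after each time step."""
    belief = dict(prior) if prior else {s: 1.0 / len(STATES) for s in STATES}
    history = []
    for observed in observations:
        belief = filter_step(belief, observed)
        history.append(belief)
    return history
```

Calling `infer([{"head_nod", "smile", "eyebrow_raise"}, {"smile"}])` returns one belief per time step. With these assumed numbers, the combination of a nod, a smile and raised eyebrows pushes the belief strongly towards "interested", and that belief carries over into the next step through the transition model.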
Images from the Mind Reading DVD
The system was trained on 100 eight-second video clips of actors expressing particular emotions, taken from the Mind Reading DVD, an interactive computer-based guide to reading emotions. Its analysis is right 90% of the time when the clips show actors and 65% of the time when they show non-actors. The system’s performance was as good as the top 6% of people in a panel of 20 who were asked to label the same set of videos.
Previous computer programs have detected the six basic emotional states of happiness, sadness, anger, fear, surprise and disgust. This system recognizes complex mental states, which are more useful because they come up more frequently in interactions. However, they are also harder to detect because they are conveyed in a sequence of movements rather than a single expression. Most other systems assume a direct mapping between facial expressions and emotion, but our system interprets the facial and head gestures in the context of the person’s most recent mental state, so the same facial expression may imply different mental states in different contexts.
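The difference between a direct mapping and a context-dependent reading can be sketched in a few lines. The example below is hypothetical: the two states, the persistence probability `STAY` and the smile likelihoods in `SMILE_GIVEN` are assumed values, and the update rule is a generic Bayesian one rather than the system’s own.

```python
# Illustrative only: the states and every number below are assumptions.
STATES = ("agreeing", "confused")
STAY = 0.9  # assumed probability that a mental state persists to the next step
SMILE_GIVEN = {"agreeing": 0.5, "confused": 0.4}  # assumed P(smile | state)


def direct_mapping():
    """Context-free reading: pick the state that best explains a smile."""
    return max(STATES, key=lambda s: SMILE_GIVEN[s])


def contextual(previous_belief):
    """Context-aware reading: weight each state by how likely it is to
    follow from the previous belief, then by how well it explains a smile."""
    scores = {}
    for s in STATES:
        predicted = sum(
            previous_belief[p] * (STAY if p == s else 1.0 - STAY)
            for p in STATES
        )
        scores[s] = predicted * SMILE_GIVEN[s]
    return max(STATES, key=lambda s: scores[s])
```

Under these assumptions, `direct_mapping()` always reads a smile as "agreeing". `contextual({"agreeing": 0.1, "confused": 0.9})` reads the same smile as "confused", because the person was most likely confused a moment earlier, while `contextual({"agreeing": 0.9, "confused": 0.1})` reads it as "agreeing".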
Current projects and future work
Monitoring a car driver
The mind-reading computer system presents information about your mental state as easily as a keyboard and mouse present text and commands. Imagine a future where we are surrounded with mobile phones, cars and online services that can read our minds and react to our moods. How would that change our use of technology and our lives? We are working with a major car manufacturer to implement this system in cars to detect driver mental states such as drowsiness, distraction and anger.
Current projects in Cambridge are considering further inputs such as body posture and gestures to improve the inference. We can then use the same models to control the animation of cartoon avatars. We are also looking at the use of mind-reading to support on-line shopping and learning systems. The mind-reading computer system may also be used to monitor and suggest improvements in human-human interaction. The Affective Computing Group at the MIT Media Laboratory is developing an emotional-social intelligence prosthesis that explores new technologies to augment and improve people’s social interactions and communication skills.
We are also exploring the ethical implications and privacy issues raised by this research. Do we want machines to watch us and understand our emotions? Mind-reading machines will undoubtedly raise the complexity of human-computer interaction to include concepts such as exaggeration, disguise and deception that were previously limited to communications between people.
Further projects and links
- Demonstrations of the system with volunteers at the CVPR Conference in 2004
- Royal Society 2006 Summer Science Exhibition
- Video of the Royal Society Summer Science exhibit
- Autism Research Centre at the University of Cambridge
- The mind-reading DVD
- Affective computing group at MIT
- Commercial product based on the system
The project has attracted a lot of attention. It has been covered by four television channels, a dozen radio stations, and a variety of other media. Google tracked over 100 on-line reports across all continents including the following: