Department of Computer Science and Technology

Technical reports

Analysis of affective expression in speech

Tal Sobol-Shikler

January 2009, 163 pages

This technical report is based on a dissertation submitted March 2007 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Girton College.

Many of the features and techniques presented in this report are subject to patent applications.

DOI: 10.48456/tr-740

Abstract

This dissertation presents an analysis of expressions in speech. It describes a novel framework for dynamic recognition of acted and naturally evoked expressions, and its application to expression mapping and to multi-modal analysis of human-computer interactions.

The focus of this research is on the analysis of a wide range of emotions and mental states from non-verbal expressions in speech, and in particular on the inference of complex mental states beyond the set of basic emotions, including naturally evoked subtle expressions and mixtures of expressions.

This dissertation describes a bottom-up computational model for the processing of speech signals. It combines signal processing, machine learning and voting methods with novel approaches to design, implementation and validation. It is based on a comprehensive framework that includes all the development stages of a system. The model represents paralinguistic speech events using temporal abstractions borrowed from various disciplines such as musicology, engineering and linguistics, and has a flexible and expandable architecture. The validation of the model extends its scope to different expressions, languages, backgrounds, contexts and applications.

The work adopts the approach that an utterance is not an isolated entity but rather a part of an interaction, and should be analysed in this context. The analysis in context includes relations to events and other behavioural cues. Expressions of mental states are related not only in time but also by their meaning and content. This work demonstrates the relations between the lexical definitions, taxonomies and theoretical conceptualisations of mental states and their vocal correlates. The results show that a very wide range of mental state concepts can be mapped, or described, using a high-level abstraction in the form of a small sub-set of concepts which are characterised by their vocal correlates.

This research is an important step towards comprehensive solutions that incorporate social intelligence cues for a wide variety of applications and for multi-disciplinary research.

Full text

PDF (2.7 MB)

BibTeX record

@TechReport{UCAM-CL-TR-740,
  author =	 {Sobol-Shikler, Tal},
  title = 	 {{Analysis of affective expression in speech}},
  year = 	 2009,
  month = 	 jan,
  url = 	 {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-740.pdf},
  institution =  {University of Cambridge, Computer Laboratory},
  doi = 	 {10.48456/tr-740},
  number = 	 {UCAM-CL-TR-740}
}