Computer Laboratory

Cam3D corpus of spontaneous complex mental states

Tadas Baltrušaitis, Marwa Mahmoud & Peter Robinson

Cam3D consists of 108 labelled videos covering 12 mental states, including spontaneous facial expressions and hand gestures. It was labelled using crowd-sourcing (inter-rater reliability κ = 0.45).

We used three different sensors for data collection: Microsoft Kinect sensors, HD cameras, and the microphones in the HD cameras. After the initial data collection, the videos were segmented. Each segment showed a single event such as a change in facial expression, a head or body posture movement, or a hand gesture. From videos with public consent, a total of 451 segments were collected. The mean segment duration is 6 seconds.

Labelling was based on context-free observer judgment. Public segments were labelled by community crowd-sourcing. From the 451 segments, we wanted to extract those that can reliably be described as belonging to one of the 24 emotion groups from the Baron-Cohen taxonomy. Of the 2916 labels collected, 122 did not appear in the taxonomy and were not considered in the analysis. The remaining 2794 labels were grouped as belonging to one of the 24 groups plus agreement, disagreement, and neutral. To filter out non-emotional segments, we chose only the videos on which 60% or more of the raters agreed. This resulted in 108 segments in total. The most common label given to a video segment was considered as the ground truth.
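The selection rule described above (keep a segment only if at least 60% of its raters agree, and take the most common label as ground truth) can be sketched as follows. This is a minimal illustration, not the code used to build the corpus: the function name `ground_truth`, the segment identifiers and the example ratings are all hypothetical, and the labels are assumed to have already been mapped to their taxonomy groups.

```python
from collections import Counter

def ground_truth(labels, threshold=0.6):
    """Return (label, agreement) if the most common label reaches
    the agreement threshold, otherwise None.

    `labels` holds one group label per rater for a single segment.
    Ties for the most common label cannot pass a 60% threshold,
    so tie-breaking order does not affect the result here.
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    agreement = count / len(labels)
    if agreement >= threshold:
        return (label, agreement)
    return None

# Hypothetical ratings, for illustration only (not taken from Cam3D).
segments = {
    "seg_001": ["thinking", "thinking", "thinking", "unsure", "thinking"],
    "seg_002": ["happy", "interested", "neutral", "happy", "bored"],
}

# Keep only the segments whose raters agree strongly enough.
selected = {}
for seg_id, labels in segments.items():
    result = ground_truth(labels)
    if result is not None:
        selected[seg_id] = result

print(selected)
```

In this made-up example, only `seg_001` is kept, because four of its five raters chose the same label; `seg_002` is discarded because its most common label covers only two of five ratings.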

The data is categorized by the ground-truth label and divided into seven folders. For each video segment, we provide the colour video, camera parameters, colour images and their corresponding aligned depth images.

This labelled corpus is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.