Department of Computer Science and Technology

Technical reports

Annotating errors and disfluencies in transcriptions of speech

Andrew Caines, Diane Nicholls, Paula Buttery

December 2017, 10 pages

Abstract

This document presents our guidelines for the annotation of errors and disfluencies in transcriptions of speech. There is a well-established precedent for annotating errors in written texts, but the same is not true of speech transcriptions. We describe our coding scheme, discuss examples and difficult cases, and introduce new codes to deal with features characteristic of speech.

Full text

PDF (0.4 MB)

BibTeX record

@TechReport{UCAM-CL-TR-915,
  author =      {Caines, Andrew and Nicholls, Diane and Buttery, Paula},
  title =       {{Annotating errors and disfluencies in transcriptions of speech}},
  year =        2017,
  month =       dec,
  url =         {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-915.pdf},
  institution = {University of Cambridge, Computer Laboratory},
  number =      {UCAM-CL-TR-915}
}