Department of Computer Science and Technology

Technical reports

Evaluating natural language processing systems

J.R. Galliers, K. Spärck Jones

February 1993, 187 pages

DOI: 10.48456/tr-291


This report presents a detailed analysis and review of NLP evaluation, in principle and in practice. Part 1 examines evaluation concepts and establishes a framework for NLP system evaluation. This part draws on experience in the related area of information retrieval and also refers to evaluation in speech processing. Part 2 surveys significant evaluation work done so far, for instance in machine translation, and discusses the particular problems of generic system evaluation. The conclusion is that evaluation strategies and techniques for NLP need much more development, in particular to take proper account of the influence of system tasks and settings. Part 3 develops a general approach to NLP evaluation, aimed at methodologically sound strategies for test and evaluation motivated by comprehensive performance factor identification. The analysis throughout the report is supported by extensive illustrative examples.

BibTeX record

@TechReport{UCAM-CL-TR-291,
  author =      {Galliers, J.R. and Sp{\"a}rck Jones, K.},
  title =       {{Evaluating natural language processing systems}},
  year =        1993,
  month =       feb,
  url =         {},
  institution = {University of Cambridge, Computer Laboratory},
  doi =         {10.48456/tr-291},
  number =      {UCAM-CL-TR-291}
}