University of Cambridge Home Computer Laboratory
Computer Science Tripos Syllabus - Natural Language Processing

Next: Numerical Analysis II Up: Lent Term 2005: Part Previous: Information Retrieval   Contents

Natural Language Processing

Lecturer: Dr A.A. Copestake

No. of lectures: 8

Prerequisite courses: none, but Regular Languages and Finite Automata, Probability, Logic and Proof, and Artificial Intelligence cover relevant material

This course is a prerequisite for Information Retrieval (Part II).


This course aims to introduce the fundamental techniques of natural language processing and to develop an understanding of the limits of those techniques. It aims to introduce some current research issues, and to evaluate some current and potential applications.


  • Introduction. Brief history of NLP research, current applications, generic NLP system architecture, knowledge-based versus probabilistic approaches.

  • Finite-state techniques. Inflectional and derivational morphology, finite-state automata in NLP, finite-state transducers.

  • Prediction and part-of-speech tagging. Corpora, simple N-grams, word prediction, stochastic tagging, evaluating system performance.

  • Parsing and generation. Generative grammar, context-free grammars, parsing and generation with context-free grammars, weights and probabilities.

  • Parsing with constraint-based grammars. Constraint-based grammar, unification.

  • Compositional and lexical semantics. Simple compositional semantics in constraint-based grammar. Semantic relations, WordNet, word senses, word sense disambiguation.

  • Discourse and dialogue. Anaphora resolution, discourse relations.

  • Applications. Machine translation, email response, spoken dialogue systems.
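To give a flavour of the material in the prediction lecture, simple bigram word prediction can be sketched as below. This is an illustrative sketch only and is not taken from the course materials: the toy corpus and the function names `train_bigrams` and `predict_next` are assumptions made for the example, and the probability is an unsmoothed relative-frequency estimate.

```python
from collections import Counter, defaultdict


def train_bigrams(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, curr in zip(tokens, tokens[1:]):
        counts[prev][curr] += 1
    return counts


def predict_next(counts, word):
    """Return (most frequent follower, estimated probability), or None if
    the word was never seen with a follower in the training data."""
    followers = counts.get(word)
    if not followers:
        return None
    best, freq = followers.most_common(1)[0]
    return best, freq / sum(followers.values())


# Hypothetical toy corpus; a realistic model would be trained on a large
# corpus and would need smoothing to handle unseen word pairs.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigrams(corpus)
print(predict_next(model, "the"))
```

A word with no recorded follower, such as the final word of the toy corpus, yields `None`, which illustrates the sparse-data problem that smoothing techniques are intended to address.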


At the end of the course students should

  • be able to describe the architecture of and basic design for a generic NLP system "shell"

  • be able to discuss the current and likely future performance of several NLP applications, such as machine translation and email response

  • be able to describe briefly a fundamental technique for processing language for several subtasks, such as morphological analysis, parsing and word sense disambiguation

  • understand how these techniques draw on and relate to other areas of (theoretical) computer science, such as formal language theory, formal semantics of programming languages, or theorem proving

Recommended book

* Jurafsky, D. & Martin, J. (2000). Speech and language processing. Prentice-Hall.

For background reading, one of:

Pinker, S. (1994). The language instinct. Penguin.
Matthews, P. (2003). Linguistics: a very short introduction. OUP.

Although the NLP lectures don't assume any exposure to linguistics, the course will be easier to follow if students have some understanding of basic linguistic concepts.

For reference purposes:

The Internet Grammar of English,

Christine Northeast
Wed Sep 8 11:57:14 BST 2004