*Lecturer: Dr E.J. Briscoe*
(`ejb@cl.cam.ac.uk`)

*No. of lectures:* 8

*Prerequisite courses: none, but Regular Languages and Finite
Automata, Probability, Logic and Proof, and Artificial Intelligence
cover relevant material*

**Aims**

This course aims to introduce the fundamental techniques of natural language processing, to develop an understanding of the limits of those techniques and of current research issues, and to evaluate some current and potential applications.

- **Introduction.** Brief history of NLP research, current applications, knowledge-based *versus* probabilistic approaches.
- **Words.** Parts-of-speech, inflectional and derivational morphology, finite-state techniques.
- **Syntax.** Generative grammar, constituency, context-free grammars, a simple unification-based grammar.
- **Parsing.** (Non-)deterministic parsing, parsing complexity, parsing preferences (garden paths), shift-reduce and chart parsing.
- **Semantics.** Truth-conditional semantics, compositionality, syntactically-driven semantics, scope ambiguities, intensionality.
- **Statistical parsing.** Ambiguity and change/variation, probabilistic grammar, grammar induction, lexical approaches and sparse data.
- **Understanding (discourse).** Theorem proving, speech acts, reference, resolving anaphora, discourse structure, abductive inference, and planning.
- **Applications of NLP.** Machine translation, information retrieval/extraction, spoken language understanding.

**Objectives**

At the end of the course students should

- be able to describe the architecture of and basic design for a
generic NLP system "shell" for a central task, such as mapping
text to appropriate logical representations
- be able to discuss the current and likely future performance of
several NLP applications, such as machine translation or
information retrieval
- be able to describe briefly a fundamental technique for
processing language for each of the main subtasks, such as
morphological analysis, syntactic parsing, etc.
- understand how these techniques draw on and relate to other
areas of (theoretical) computer science, such as formal language
theory, formal semantics of programming languages, or theorem
proving
- be able to compare and evaluate several approaches to some
subtasks, such as statistical *versus* knowledge-based
part-of-speech disambiguation or anaphora resolution
- recognise the major properties of natural languages and how
these relate to and differ from those of artificial languages

**Recommended books**

Jurafsky, D. & Martin, J. (2000). *Speech and Language
Processing*. Prentice-Hall.

Russell, S. & Norvig, P. (1995). *Artificial Intelligence: A
Modern Approach*. Prentice-Hall. (Especially Part VII, but see
Parts III, IV and V for supporting material.)

Recommended background reading:

Pinker, S. (1994). *The Language Instinct*. Penguin.