Next: Neural Computing
Up: Lent Term 2002: Part
Previous: Comparative Architectures
  Contents
Lecturer: Dr E.J. Briscoe (ejb@cl.cam.ac.uk)
No. of lectures: 8
Prerequisite courses: none, although Regular Languages and Finite
Automata, Probability, Logic and Proof, and Artificial Intelligence
cover relevant material
Aims
This course aims to introduce the fundamental techniques of natural
language processing, to develop an understanding of the limits of
those techniques and of current research issues, and to evaluate some
current and potential applications.
- Introduction.
Brief history of NLP research, current applications, knowledge-based
versus probabilistic approaches.
- Words.
Parts-of-speech, inflectional and derivational morphology,
finite-state techniques.
- Syntax.
Generative grammar, constituency, context-free grammars, a
simple unification-based grammar.
- Parsing.
(Non-)deterministic parsing, parsing complexity, parsing
preferences (garden paths), shift-reduce and chart parsing.
- Semantics.
Truth-conditional semantics, compositionality, syntactically-driven
semantics, scope ambiguities, intensionality.
- Statistical parsing.
Ambiguity and change/variation, probabilistic grammar, grammar
induction, lexical approaches and sparse data.
- Understanding (discourse).
Theorem proving, speech acts, reference, resolving anaphora,
discourse structure, abductive inference, and planning.
- Applications of NLP.
Machine translation, information retrieval/extraction, spoken language
understanding.
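As an illustration of the kind of technique covered in the Syntax and Parsing topics above, the following is a minimal sketch (not course material) of chart-style context-free recognition: a CYK recognizer over a toy grammar in Chomsky normal form. The grammar, lexicon, and example sentence are invented for illustration.

```python
# Illustrative CYK recognizer for a toy context-free grammar in
# Chomsky normal form. Grammar and lexicon are made-up examples.
from itertools import product

# Binary rules A -> B C, keyed by the right-hand side (B, C).
binary_rules = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}

# Lexical rules A -> word, keyed by the word.
lexical_rules = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "chased": {"V"},
}

def cyk_recognise(words):
    """Return True iff the word sequence is derivable from S."""
    n = len(words)
    # chart[i][j] holds the categories spanning words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexical_rules.get(w, ()))
    for span in range(2, n + 1):            # span widths 2..n
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # split point
                for b, c in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= binary_rules.get((b, c), set())
    return "S" in chart[0][n]
```

For example, `cyk_recognise("the dog chased the cat".split())` succeeds, while an ungrammatical ordering such as `"chased the dog"` does not; the chart makes the parsing complexity explicit (cubic in sentence length for a fixed grammar), a point taken up in the Parsing topic.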
Objectives
At the end of the course students should
- be able to describe the architecture of and basic design for a
generic NLP system "shell" for a central task, such as mapping
text to appropriate logical representations
- be able to discuss the current and likely future performance of
several NLP applications, such as machine translation or
information retrieval
- be able to briefly describe a fundamental technique for each of
the main language-processing subtasks, such as morphological
analysis, syntactic parsing, etc.
- understand how these techniques draw on and relate to other
areas of (theoretical) computer science, such as formal language
theory, formal semantics of programming languages, or theorem
proving
- be able to compare and evaluate several approaches to some
subtasks, such as statistical versus knowledge-based part-of-speech
disambiguation or anaphora resolution
- recognise the major properties of natural languages and how
these relate to and differ from those of artificial languages
Recommended books
Jurafsky, D. & Martin, J. (2000). Speech and Language
Processing. Prentice-Hall.
Russell, S. & Norvig, P. (1995). Artificial Intelligence: A
Modern Approach. Prentice-Hall. (Especially Chapter VII, but see
III, IV and V for supporting material.)
Recommended background reading:
Pinker, S. (1994). The Language Instinct. Penguin.
Christine Northeast
Tue Sep 4 09:34:31 BST 2001