Department of Computer Science and Technology

Course pages 2019–20

Formal Models of Language

Principal lecturer: Prof Paula Buttery
Taken by: Part IB CST 75%, Part II CST 50%
Past exam questions

No. of lectures: 8
Suggested hours of supervisions: 2
Prerequisite courses: Discrete Maths; Compiler Construction

Aims

This course studies formal models of language and considers how they might be relevant to the processing and acquisition of natural languages. The course will extend knowledge of formal language theory, introduce several new grammar formalisms, and use concepts from information theory to describe natural language.

Lectures

  • Natural language and the Chomsky hierarchy 1. Recap of classes of language. Closure properties of language classes. Recap of the pumping lemma for regular languages. Discussion of relevance (or not) to natural languages (example: embedded clauses in English).
  • Natural language and the Chomsky hierarchy 2. Pumping lemma for context-free languages. Discussion of relevance (or not) to natural languages (example: Swiss-German cross-serial dependencies). Properties of mildly context-sensitive languages. Introduction to tree-adjoining grammars.
  • Language processing and context-free grammar parsing 1. Recap of context-free grammar parsing (a toy CKY recognition sketch appears after this list). Language processing predictions based on top-down parsing models (example: Yngve's language processing predictions). Language processing predictions based on probabilistic parsing (example: Hale's language processing predictions).
  • Language processing and context-free grammar parsing 2. Introduction to context-free-grammar-equivalent dependency grammars. Language processing predictions based on shift-reduce parsing (examples: prosodic look-ahead parsers, Parsey McParseface).
  • Grammar induction of language classes. Introduction to grammar induction. Discussion of relevance (or not) to natural language acquisition. Gold's theorem. Introduction to context-free-grammar-equivalent categorial grammars and their learnable classes.
  • Natural language and information theory 1. Entropy and natural language typology. Uniform information density as a predictor for language processing.
  • Natural language and information theory 2. Noisy channel encoding as a model for spelling error, translation and language processing (standard formulations of entropy and noisy-channel decoding appear after this list).
  • Vector space models and word vectors. Introduction to word vectors (example: Word2Vec). Word vectors as predictors for semantic language processing (a toy cosine-similarity sketch appears after this list).
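
The parsing lectures assume familiarity with chart parsing of context-free grammars. As a point of orientation only (not course material), the sketch below shows CKY recognition for a toy grammar in Chomsky normal form; the grammar, lexicon and example sentence are invented for illustration.

```python
# Minimal CKY recognition sketch for a toy CNF grammar.
# The grammar and sentence are illustrative only, not course material.
from itertools import product

LEXICAL = {               # lexical rules: non-terminal -> word
    ("Det", "the"), ("N", "dog"), ("N", "cat"), ("V", "chased"),
}
BINARY = {                # binary rules: non-terminal -> pair of non-terminals
    ("NP", ("Det", "N")), ("VP", ("V", "NP")), ("S", ("NP", "VP")),
}

def cky_recognise(words):
    n = len(words)
    # chart[i][j] holds the non-terminals that derive words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = {A for (A, word) in LEXICAL if word == w}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for B, C in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= {A for (A, rhs) in BINARY if rhs == (B, C)}
    return "S" in chart[0][n]

print(cky_recognise("the dog chased the cat".split()))  # True
```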
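
The information-theoretic lectures build on two standard textbook definitions, reproduced here in the usual notation (nothing course-specific is assumed):

```latex
% Entropy of a random variable X over, e.g., words or characters (in bits)
H(X) = -\sum_{x} p(x)\,\log_2 p(x)

% Noisy channel decoding: recover the intended string t from an observed
% (e.g. misspelled or foreign-language) string s via Bayes' rule
\hat{t} = \operatorname*{arg\,max}_{t} P(t \mid s)
        = \operatorname*{arg\,max}_{t} P(s \mid t)\,P(t)
```

Under the uniform information density hypothesis, per-word surprisal, -log2 p(word | context), is predicted to be spread roughly evenly across an utterance.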
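
The final lecture relies on geometric similarity between word embeddings. The toy sketch below (invented three-dimensional vectors, not real Word2Vec output) illustrates only the cosine-similarity computation:

```python
# Cosine similarity between toy word vectors. Real Word2Vec embeddings are
# learned from corpora and typically have hundreds of dimensions; these
# three-dimensional values are invented for illustration.
import math

TOY_VECTORS = {
    "dog": [0.90, 0.80, 0.10],
    "cat": [0.85, 0.75, 0.20],
    "car": [0.10, 0.20, 0.90],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine(TOY_VECTORS["dog"], TOY_VECTORS["cat"]))  # close to 1: similar
print(cosine(TOY_VECTORS["dog"], TOY_VECTORS["car"]))  # smaller: dissimilar
```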

Objectives

At the end of the course students should

  • understand how known natural languages relate to formal languages in the Chomsky hierarchy;
  • have knowledge of several grammar formalisms equivalent to context-free grammars;
  • understand how we might make predictions about language processing and language acquisition from formal models;
  • know how to use information theoretic concepts to describe aspects of natural language.

Recommended reading

  • Jurafsky, D. & Martin, J. (2008). Speech and language processing. Prentice Hall.
  • Manning, C.D. & Schütze, H. (1999). Foundations of statistical natural language processing. MIT Press.
  • Mitkov, R. (ed.) (2003). The Oxford handbook of computational linguistics. Oxford University Press.
  • Clark, A., Fox, C. & Lappin, S. (eds.) (2010). The handbook of computational linguistics and natural language processing. Wiley-Blackwell.
  • Kozen, D. (1997). Automata and computability. Springer.