Prof. Ann Copestake
Dr. Aurelie Herbelot


Distributional semantics for linguists

Course overview

Distributional semantics investigates meaning by looking at the contexts in which a term occurs. A wide range of techniques are under active investigation in computational linguistics and psycholinguistics. However, distributional approaches have had little impact within 'mainstream' linguistics. In this course, we will address the question of what linguists could get out of current approaches. We will consider a range of phenomena in semantics and show how distributional techniques can be applied. We will describe some of the tools and corpora which are available, with the aim of encouraging participants with no experience of distributional semantics to try their own experiments. We will also give an introduction to our own work on relating distributional semantics to model-theoretic semantics.


Lecture 1
A brief historical overview of work on distributional semantics and an outline of the structure of the course. Various distributional models are introduced. A short, step-by-step example of implementing basic distributional techniques is given, including a brief discussion of common issues.
Slides: Session 1a, Session 1b
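
For participants who would like a feel for what a basic distributional technique involves before the lecture, the following is a minimal sketch, not the example used in the slides. It assumes the simplest possible setup: contexts are the words within a fixed window around the target, and the distribution of a word is a table of raw co-occurrence counts. The function name, the window size and the two-sentence toy corpus are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Count, for each target word, the words seen within `window` tokens of it.

    Hypothetical helper for illustration: whitespace tokenisation, lowercasing,
    and raw counts with no weighting or stop-word filtering.
    """
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:  # a word is not its own context
                    vectors[target][tokens[j]] += 1
    return vectors


# Made-up toy corpus, only to show the shape of the output.
toy_corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
]
vectors = cooccurrence_vectors(toy_corpus, window=2)
print(vectors["cat"].most_common())
```

Even on this toy corpus, the counts for "cat" are dominated by the function word "the", which is one of the common issues that weighting schemes and stop-word lists are meant to address.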

Lecture 2
This lecture focuses on lexical semantics. We discuss similarity and synonymy, introducing in the process the main similarity metric used in distributional semantics. A review of the differences between distributional semantics and classical lexical semantics is presented. We show that notions of true synonymy, hyponymy and antonymy cannot be formally captured using standard distributional semantics techniques. We then give an overview of the issues related to polysemy and sense clustering, and introduce some applications of distributional techniques.
Slides: Session 2a, Session 2b
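
As a rough illustration of how similarity between two distributions can be computed, the sketch below uses cosine similarity, the cosine of the angle between two count vectors. That cosine is the metric meant in the description above is an assumption on our part, based on it being the usual choice in the field. The vectors are made-up toy counts, not corpus data, and the function name is illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts.

    Returns 0.0 if either vector is empty or all zeros.
    """
    dot = sum(weight * v.get(context, 0) for context, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented co-occurrence counts, for illustration only.
cat = {"chased": 2, "the": 3}
dog = {"chased": 1, "the": 2}
car = {"drove": 3, "the": 1}

print(cosine(cat, dog))  # high: the two vectors share most of their contexts
print(cosine(cat, car))  # lower: only the function word is shared
```

Note that a high score only says that two words occur in similar contexts. It does not by itself distinguish synonyms from antonyms or hyponyms, which is the limitation discussed in this lecture.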

Lecture 3
This part of the course investigates the relationship between distributional techniques and classical model-theoretic semantics. We start by discussing composition and the various methods proposed in the literature for composing distributions. We cover the evaluation of such methods and highlight their shortcomings. Our own approach to combining formal and distributional semantics ('Lexicalised Compositionality') is then briefly introduced and evaluated in the light of the linguistic requirements identified so far in the course.
Slides: Session 3a, Session 3b
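
To make the idea of composing distributions concrete, here is a minimal sketch of two simple operations often discussed in the literature: pointwise addition and pointwise multiplication of the word vectors. Which methods the lecture actually covers is not specified above, so the choice of these two is an assumption, and this is not an implementation of Lexicalised Compositionality. The function names and the toy counts are invented for illustration.

```python
def compose_additive(u, v):
    """Pointwise sum: every context of either word survives."""
    contexts = set(u) | set(v)
    return {c: u.get(c, 0) + v.get(c, 0) for c in contexts}


def compose_multiplicative(u, v):
    """Pointwise product: only contexts shared by both words survive."""
    contexts = set(u) & set(v)
    return {c: u[c] * v[c] for c in contexts}


# Invented co-occurrence counts, for illustration only.
black = {"dark": 4, "colour": 3, "night": 2}
cat = {"pet": 5, "dark": 1, "night": 1}

print(compose_additive(black, cat))
print(compose_multiplicative(black, cat))
```

Both operations are symmetric, so they assign the same vector to a phrase regardless of word order. This is one example of the kind of shortcoming that any evaluation of composition methods has to confront.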

Lecture 4
The core of this lecture is the potential for relating the Generative Lexicon (GL) framework and distributional semantics. An introduction to GL theory is given, with emphasis on regular polysemy, logical metonymy and qualia structure. We discuss the achievements of GL, as well as some problematic aspects, and review related distributional work.
Slides: Session 4

Lecture 5
Our last lecture investigates the treatment of closed-class words in distributional semantics, with a focus on quantification. We argue that, although quantifiers cannot be given a direct distributional interpretation, their (often context-dependent) semantics can be retrieved from the distributional interpretation of their arguments.
Slides: Session 5