Computer Laboratory

Course pages 2014–15

Lexical Semantics

Principal lecturer: Dr Simone Teufel
Taken by: MPhil ACS, Part III
Code: L114
Hours: 16 (8 × two-hour lecture sessions)
Prerequisites: L90 Overview of Natural Language Processing

Aims

This module provides an introduction to NLP research centred on lexical semantics (i.e., aspects of the meaning of words and relations between word meanings). Relevant phenomena are first described theoretically; algorithms for determining meaning and detecting text structure are then presented. Special attention is given to adequate evaluation methods in each area. Applications are also discussed where appropriate.

Note: students are expected to take L90 Overview of Natural Language Processing in parallel with this module.

Syllabus

  • Session 1: Background to lexical semantics and word senses. What is word meaning, and what are word senses? What does a lexicographer do? Psycholinguistic background, linguistic tests for ambiguity.
  • Session 2: Word sense disambiguation. Supervised and unsupervised methods of determining the sense of a word. How can a computer learn when "bass" is a fish, and when it is a musical instrument? How does a piece of text "hang together" lexically? How can we segment a piece of text according to the topics it discusses?
  • Session 3: Lexical relations. Theory of lexical relations, word association norms. Ontologies and taxonomies. WordNet. Compound nouns. Lexical chains.
  • Session 4: Distributional semantics and semantic spaces. How words can be represented "by the company they keep". The vector space model, and dimensionality reduction models. LSI, topic models.
  • Session 5: Verb classes and clustering. Frame semantics, semantic role labelling, selectional preferences. In what respects can verb meanings be similar to each other (e.g., purchase, buy, sell, lend)? How can we represent these similarities?
  • Session 6: Figurative language. What are metaphors, metonymies and similes? How can a machine recognise and interpret figurative language?
  • Session 7: Antonymy and sentiment detection. What does it mean for a piece of text to display negative or positive sentiment, and how could it be automatically recognised? Which types of words have an "opposite" -- and what does that mean in each case?
  • Session 8: Applications based on lexical semantics. Automatic thesaurus creation for information retrieval. Application of sentiment detection to summarisation. Lexical chain-based applications.

Objectives

On completion of this module, students should:

  • understand (and be able to describe coherently) core aspects of word meaning such as synonymy, similarity, word senses;
  • have gained some intuition about these phenomena by experimentation with corpora of everyday language;
  • understand the principles behind automatic methods for representing important aspects of word meaning and for resolving lexical-semantic ambiguities;
  • appreciate how these algorithms relate to the rest of the field of natural language processing.

Coursework

Two small, non-assessed pieces of coursework (weeks 2 and 5).

Practical work

A self-directed corpus study which will be part of the assessed coursework.

Assessment

An assessed report of the above-mentioned corpus study, drawing together various aspects of lexical semantics taught on the course.

Recommended reading

Cruse, A. (2000), Meaning in Language. Oxford University Press, chapters 5-9 and 11.
Cruse, A. (1986), Lexical Semantics. Cambridge University Press, chapter 7.
Jurafsky, D. and Martin, J. H. (2008), Speech and Language Processing, 2nd edition, chapters 19 and 20.