Computer Laboratory

Course pages 2016–17

Discourse Processing

Reading List

Session 1: Introduction and Overview

Core Reading:

  • Steven Pinker, The Sense of Style (2014). Chapter 5: Arcs of Coherence.
  • Jurafsky and Martin (2008), Speech and Language Processing, 2nd Edition, chapter 21.

Session 2: Topic Segmentation

Core reading:

Deep reading:

Session 3: Anaphora Resolution

Core Reading:

Deep Reading:

Session 4: Centering and Entity Coherence

Core Reading:

Deep Reading:

Session 5: Rhetorical Structure Theory

Core Reading:

Deep Reading:

Session 6: Annotation Methodology

Core Reading:

Deep Reading:

Session 7: Argumentative Zoning

Core Reading:

Deep Reading:

Session 8: Summarisation and Discourse Structure

Core Reading:

Deep Reading:

Annotation Task

Consider the following two texts:

Please do the following tasks with these texts:

WEEK 2 : Segment the texts into topics; give each segment a name (its topic). Please send me your analyses by email, and I will put them up on this website.

WEEK 3 : Perform anaphora resolution on text 2. Steps: a) identify each anaphor; b) indicate which other referring expression it corefers with.

Out of all possible anaphors, please concentrate only on two types: a) pronouns b) definite noun phrases (noun phrases starting with "the")

WEEK 4 : Simulate the Centering algorithm on the first six consecutive sentences of the "invention of matches" text. You can use a "standard" definition of anaphora resolution: resolve all pronouns and definite noun phrases. Then construct the forward-looking and backward-looking centers for each sentence, and decide which type of transition was performed.
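Once you have the backward-looking center (Cb) of each sentence and the preferred center (Cp, the highest-ranked element of the forward-looking list Cf), deciding the transition type is mechanical. A minimal sketch of that last step, using the standard Continue / Retain / Smooth-Shift / Rough-Shift transition definitions (the function name and string labels are illustrative, not part of the assignment):

```python
def classify_transition(cb_prev, cb_curr, cp_curr):
    """Classify the Centering transition between two utterances.

    cb_prev -- Cb of the previous utterance (None if undefined)
    cb_curr -- Cb of the current utterance (None if undefined)
    cp_curr -- Cp of the current utterance (highest-ranked Cf element)
    """
    if cb_curr is None:
        return "no Cb"
    if cb_curr == cb_prev or cb_prev is None:
        # Same center carried over: Continue if it is also
        # the preferred center, otherwise Retain.
        return "Continue" if cb_curr == cp_curr else "Retain"
    # Center changed: Smooth-Shift if the new Cb is the
    # preferred center, otherwise Rough-Shift.
    return "Smooth-Shift" if cb_curr == cp_curr else "Rough-Shift"

# Toy example: "John" stays the center and is still preferred.
print(classify_transition("John", "John", "John"))   # Continue
print(classify_transition("John", "John", "Mary"))   # Retain
print(classify_transition("John", "Mary", "Mary"))   # Smooth-Shift
```

The hard part of the exercise — ranking the Cf list and resolving the anaphors — still has to be done by hand; this only checks the final classification step.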

WEEK 5: Provide an RST analysis of the first six consecutive sentences of "motor car". (It is probably easiest if you write on paper and scan it; alternatively, here are instructions for how to run an RST-tree drawing tool that previous years' students used.)

WEEK 8: Perform a Kintsch and van Dijk style proposition tree manipulation for the first five sentences of "invention of fire lighting". To do so, you first have to create propositions. Please use your own best guess of what a proposition could look like, using Kintsch and van Dijk's propositions in the article as your guide. You then have to draw trees representing which argument overlap is the best fit for each proposition. As there is no title, you can choose a usable proposition from the first sentence at random (verbal propositions often work best). You do this sentence by sentence, working off the propositions in the order you created them. When all propositions for one sentence have been worked off, you apply the leading-edge forgetting rule and move on to the next sentence. The "summary" of the text consists of the propositions that were retained in the most memory cycles.
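The leading-edge rule keeps in working memory the most recently attached proposition together with its superordinates, up to a fixed buffer size. A minimal sketch of that bookkeeping, assuming each proposition records which earlier proposition it attached to via argument overlap (the representation of the tree as a parent map, and the truncation that prefers the recent end of the path, are simplifying assumptions, not Kintsch and van Dijk's exact selection procedure):

```python
def leading_edge(parent, last, capacity):
    """Propositions retained in the memory buffer after one cycle.

    parent   -- dict mapping each proposition to the proposition it
                attached to via argument overlap (the root is absent)
    last     -- the most recently attached proposition
    capacity -- buffer size s of the model
    """
    # Climb from the most recent proposition up to the root:
    # this path is the "leading edge" of the tree.
    edge = [last]
    while edge[-1] in parent:
        edge.append(parent[edge[-1]])
    edge.reverse()  # root first
    # If the edge exceeds the buffer, keep the most recent end.
    return edge[-capacity:]

# Toy tree: P2 and P3 attach to P1; P4 attaches to P2.
parent = {"P2": "P1", "P3": "P1", "P4": "P2"}
print(leading_edge(parent, "P4", 2))  # ['P2', 'P4']
```

Counting how often each proposition appears in successive buffers then gives the "summary" propositions mentioned above.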

Your annotations:

Week 2 (Topic Segmentation):

Week 3 (Pronoun Resolution):

Week 4 (Centering):

Week 5 (RST):