Shallow processing and automatic summarising: a first study

Philip Gladwin, Stephen Pulman, Karen Spärck Jones

May 1991, 65 pages

This report describes a study of ten simple texts, investigating various discourse phenomena to see how they might be exploited, in shallow text processing, for summarising purposes. The processing involved was a simulation of automatic analysis that is in principle within reach of the state of the art. Each text was treated by a version of Sidner’s focusing algorithm. The products of this were fed into subsidiary stages of analysis to provide an assessment of the activity of the various discourse entities within each text. A concurrent process examined the occurrence of orthographically identical noun phrase forms. Appendices give the ten texts, a complete specification of the version of the focusing algorithm in use, and the full experimental results. The results suggest, especially when the brevity of the test texts is taken into account, that the type of information given by focusing has potential but limited value for summarising.

