Department of Computer Science and Technology

Technical reports

Analysis and inference for English

Arthur William Sebright Cater

September 1981, 223 pages

This technical report is based on a dissertation submitted September 1981 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Queens’ College.

DOI: 10.48456/tr-19

Abstract

AD-HAC is a computer program which understands stories. Its three principal components each deal with significant subareas of the overall language-processing task: it has a sentence analyser, which creates conceptual representations of the meanings of individual sentences; an inferencer, which assimilates these into the existing representation of a story, determining pronoun referents and answering questions as a byproduct of this activity; and a sentence generator, which produces English sentences conveying the meaning of conceptual representations. The research reported here has focussed on the analyser and the inferencer.

The analyser uses an ATN to identify low-level syntactic constituents, such as verb groups or prepositional phrases: ‘requests’ associated with words, particularly verbs, are then applied in a nondeterministic preference-directed framework, using the constituents as building blocks in the analysis of phrases, clauses and sentences: the requests fall into five distinct processing classes. The partial analyses which result from the application or non-application of particular requests are ordered by preference, and the most-preferred partial analysis is pursued first, giving a predominantly left-to-right scan through the sentence. A surprising result is that the analyser performs better if it is permitted to keep only a small number of partial analyses.
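
The preference-ordered search over partial analyses can be illustrated with a small Python sketch. It is not AD-HAC's implementation: the names `PartialAnalysis`, `analyse`, `toy_expand` and `TOY_COSTS`, the numeric cost scheme, and the noun/verb toy lexicon are all assumptions made for illustration, and the report's ATN and five request classes are not modelled. The sketch only shows the general idea of pursuing the most-preferred partial analysis first while keeping a small, bounded set of alternatives.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class PartialAnalysis:
    cost: float                            # lower cost = more preferred (assumed scheme)
    position: int                          # number of words consumed so far
    structure: Tuple[Tuple[str, str], ...] # (word, reading) pairs chosen so far


# Hypothetical toy lexicon: each reading of a word carries a preference cost.
TOY_COSTS = {
    "time": {"noun": 0.0, "verb": 1.0},
    "flies": {"noun": 1.0, "verb": 0.0},
}


def toy_expand(analysis: PartialAnalysis, words: List[str]) -> List[PartialAnalysis]:
    """Return one successor per reading of the next word."""
    word = words[analysis.position]
    readings = TOY_COSTS.get(word, {"unknown": 0.5})
    return [
        PartialAnalysis(
            cost=analysis.cost + cost,
            position=analysis.position + 1,
            structure=analysis.structure + ((word, tag),),
        )
        for tag, cost in readings.items()
    ]


def analyse(
    words: List[str],
    expand: Callable[[PartialAnalysis, List[str]], List[PartialAnalysis]],
    keep: int = 3,
) -> Optional[PartialAnalysis]:
    """Pursue the most-preferred partial analysis first, keeping at most `keep`."""
    agenda = [PartialAnalysis(0.0, 0, ())]
    while agenda:
        best = agenda.pop(0)               # most-preferred partial analysis
        if best.position == len(words):
            return best                    # a complete analysis of the sentence
        agenda.extend(expand(best, words))
        agenda.sort(key=lambda a: a.cost)  # reorder by preference (stable sort)
        del agenda[keep:]                  # discard all but a few alternatives
    return None


result = analyse(["time", "flies"], toy_expand, keep=2)
print(result.structure)
```

Because extending the best candidate consumes the next word, the scan is predominantly left to right, and a less-preferred alternative is revisited only if it rises to the front of the agenda.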

The inferencer exploits the primitives of the conceptual representation language, using these as the main indicator of the appropriate set of inferences. The inferences are specified by means of inference networks associated with the conceptual primitives. Tests are applied to elementary propositions derived from input sentence analyses, and select paths through the networks where appropriate inferences are made. Inference networks are also associated with ‘functions’ of objects, permitting higher-level inferences than can normally be made using the primitives alone: the resulting system offers a synthesis of low-level inference and script-like inference. The inferences made by the networks are also used to determine the referents of pronouns, and to provide the answers to questions: the program takes an identical approach to these two tasks.
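
As a rough illustration of an inference network indexed by a conceptual primitive, the following Python sketch applies tests to a proposition and makes inferences along the paths whose tests succeed. Everything here is hypothetical: the primitive names `TRANSFER`, `POSSESS` and `NOT-POSSESS`, the dictionary representation of propositions, and the names `Node`, `run_network`, `NETWORKS` and `infer_from` are assumptions for illustration and are not taken from the report.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Proposition = Dict[str, str]


@dataclass
class Node:
    test: Callable[[Proposition], bool]                 # selects this path or not
    infer: Callable[[Proposition], List[Proposition]]   # inferences made at this node
    children: List["Node"] = field(default_factory=list)


def run_network(node: Node, prop: Proposition) -> List[Proposition]:
    """Follow every path whose test succeeds, collecting the inferences made."""
    if not node.test(prop):
        return []
    inferred = list(node.infer(prop))
    for child in node.children:
        inferred.extend(run_network(child, prop))
    return inferred


# Hypothetical network for a hypothetical TRANSFER primitive.
donor_loses = Node(
    test=lambda p: "from" in p,
    infer=lambda p: [
        {"primitive": "NOT-POSSESS", "owner": p["from"], "object": p["object"]}
    ],
)
transfer_root = Node(
    test=lambda p: True,
    infer=lambda p: [
        {"primitive": "POSSESS", "owner": p["to"], "object": p["object"]}
    ],
    children=[donor_loses],
)

# The primitive of a proposition indicates which network to run.
NETWORKS: Dict[str, Node] = {"TRANSFER": transfer_root}


def infer_from(prop: Proposition) -> List[Proposition]:
    network = NETWORKS.get(prop["primitive"])
    return run_network(network, prop) if network else []


for fact in infer_from(
    {"primitive": "TRANSFER", "from": "John", "to": "Mary", "object": "book"}
):
    print(fact)
```

In a sketch of this kind, a network attached to an object's ‘function’ rather than to a primitive could be added as a further entry in a similar table; how AD-HAC actually organises this is described in the report itself.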

The performance of the system is illustrated by reference to texts which have been successfully processed by AD-HAC.

Full text

PDF (17.7 MB)

BibTeX record

@TechReport{UCAM-CL-TR-19,
  author =	 {Cater, Arthur William Sebright},
  title = 	 {{Analysis and inference for English}},
  year = 	 1981,
  month = 	 sep,
  url = 	 {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-19.pdf},
  institution =  {University of Cambridge, Computer Laboratory},
  doi = 	 {10.48456/tr-19},
  number = 	 {UCAM-CL-TR-19}
}