Computer Laboratory

Technical reports

Automatic resolution of linguistic ambiguities

Branimir Konstatinov Boguraev

222 pages

This technical report is based on a dissertation submitted August 1979 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Trinity College.

Abstract

The thesis describes the design, implementation and testing of a natural language analysis system capable of performing the task of generating paraphrases in a highly ambiguous environment. The emphasis is on incorporating strong semantic judgement in an augmented transition network grammar: the system provides a framework for examining the relationship between syntax and semantics in the process of text analysis, especially while treating the related phenomena of lexical and structural ambiguity. Word-sense selection is based on global analysis of context within a semantically well-formed unit, with primary emphasis on the verb choice. In building structures representing text meaning, the analyser relies not on screening through many alternative structures – intermediate, syntactic or partial semantic – but on dynamically constructing only the valid ones. The two tasks of sense selection and structure building are procedurally linked by the application of semantic routines derived from Y. Wilks’ preference semantics, which are invoked at certain well-chosen points of the syntactic constituent analysis – this delimits the scope of their action and provides context for a particular disambiguation technique. The hierarchical process of sentence analysis is reflected in the hierarchical organisation of application of these semantic routines – this allows the efficient coordination of various disambiguation techniques, and the reduction of syntactic backtracking, non-determinism in the grammar, and semantic parallelism. The final result of the analysis process is a dependency structure providing a meaning representation of the input text with labelled components centred on the main verb element, each characterised in terms of semantic primitives and expressing both the meaning of a constituent and its function in the overall textual unit.

The representation serves as an input to the generator, organised around the same underlying principle as the analyser – the verb is central to the clause. Currently the generator works in paraphrase mode, but is specifically designed so that with minimum effort and virtually no change in the program control structure and code it could be switched over to perform translation.

The thesis discusses the rationale for the approach adopted, comparing it with others, describes the system and its machine implementation, and presents experimental results.
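
As a rough illustration of the verb-centred, preference-based word-sense selection that the abstract describes, the following minimal Python sketch scores combinations of word senses by how many of the verb's preferences they satisfy. It is not the thesis's implementation: the lexicon, the semantic classes, and the names `NOUN_SENSES`, `VERB_SENSES`, `ANCESTORS` and `disambiguate` are all hypothetical. The sketch also enumerates every sense combination for brevity, whereas the analyser described above is said to construct only the valid structures.

```python
from itertools import product

# Hypothetical semantic class hierarchy: each class lists itself and its ancestors.
ANCESTORS = {
    "HUMAN": {"HUMAN", "ANIMATE"},
    "ANIMAL": {"ANIMAL", "ANIMATE"},
    "THING": {"THING"},
    "FOOD": {"FOOD"},
}

# Hypothetical toy lexicon. Each noun sense carries one semantic class.
NOUN_SENSES = {
    "man": [("man/human", "HUMAN")],
    "crane": [("crane/bird", "ANIMAL"), ("crane/machine", "THING")],
    "seed": [("seed/grain", "FOOD")],
}

# Each verb sense states the class it prefers for its subject and object.
VERB_SENSES = {
    "eats": [
        ("eat/ingest", {"subject": "ANIMATE", "object": "FOOD"}),
        ("eat/corrode", {"subject": "SUBSTANCE", "object": "THING"}),
    ],
}


def disambiguate(subject, verb, obj):
    """Return (score, chosen senses) for the best-scoring sense combination.

    The score counts how many of the verb sense's preferences are satisfied
    by the chosen subject and object senses.
    """
    best = None
    for (v_name, prefs), (s_name, s_cls), (o_name, o_cls) in product(
        VERB_SENSES[verb], NOUN_SENSES[subject], NOUN_SENSES[obj]
    ):
        score = int(prefs["subject"] in ANCESTORS[s_cls]) + int(
            prefs["object"] in ANCESTORS[o_cls]
        )
        # Keep the first combination that reaches the highest score.
        if best is None or score > best[0]:
            best = (score, {"verb": v_name, "subject": s_name, "object": o_name})
    return best


if __name__ == "__main__":
    print(disambiguate("crane", "eats", "seed"))
```

In this toy lexicon the verb sense that prefers an animate subject and a food object selects the bird reading of "crane", which shows how the choice of verb sense can drive the selection of noun senses within one clause.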

Full text

PDF (10.9 MB)

BibTeX record

@TechReport{UCAM-CL-TR-11,
  author =      {Boguraev, Branimir Konstatinov},
  title =       {{Automatic resolution of linguistic ambiguities}},
  url =         {http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-11.pdf},
  institution = {University of Cambridge, Computer Laboratory},
  number =      {UCAM-CL-TR-11}
}