Syntactic simplification and text cohesion
Advaith Siddharthan
August 2004, 195 pages
This technical report is based on a dissertation submitted November 2003 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Gonville and Caius College.
DOI: 10.48456/tr-597
Abstract
Syntactic simplification is the process of reducing the grammatical complexity of a text while retaining its information content and meaning. The aim of syntactic simplification is to make text easier for human readers to comprehend, or for programs to process. In this thesis, I describe how syntactic simplification can be achieved using shallow robust analysis, a small set of hand-crafted simplification rules and a detailed analysis of the discourse-level aspects of syntactically rewriting text. I offer a treatment of relative clauses, apposition, coordination and subordination.
I present novel techniques for relative clause and appositive attachment. I argue that these attachment decisions are not purely syntactic. My approaches rely on a shallow discourse model and on animacy information obtained from a lexical knowledge base. I also show how clause and appositive boundaries can be determined reliably using a decision procedure based on local context, represented by part-of-speech tags and noun chunks.
I then formalise the interactions that take place between syntax and discourse during the simplification process. This is important because the usefulness of syntactic simplification in making a text accessible to a wider audience can be undermined if the rewritten text lacks cohesion. I describe how various generation issues, such as sentence ordering, cue-word selection, referring-expression generation, determiner choice and pronominal use, can be resolved so as to preserve conjunctive and anaphoric cohesive relations during syntactic simplification.
In order to perform syntactic simplification, I have had to address various natural language processing problems, including clause and appositive identification and attachment, pronoun resolution and referring-expression generation. I evaluate my approaches to solving each problem individually, and also present a holistic evaluation of my syntactic simplification system.
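To make the idea of a hand-crafted simplification rule concrete, the following is a minimal illustrative sketch in Python. It is not taken from the report and is not the author's rule set: the pattern `RELATIVE_CLAUSE` and the function `split_relative_clause` are hypothetical names, and the rule assumes the relative clause is set off by commas and attaches to the noun phrase that opens the sentence. It deliberately ignores the harder problems the abstract describes (clause boundary detection, attachment, sentence ordering and referring-expression generation).

```python
import re
from typing import List

# Hypothetical toy rule (an assumption for illustration only):
# "<NP>, who|which <clause>, <rest>."  ->  "<NP> <rest>."  +  "<NP> <clause>."
RELATIVE_CLAUSE = re.compile(
    r"^(?P<np>[^,]+), (?P<rel>who|which) (?P<clause>[^,]+), (?P<rest>.+?)\.$"
)


def split_relative_clause(sentence: str) -> List[str]:
    """Split a comma-delimited relative clause into a separate sentence.

    Returns the sentence unchanged (as a one-element list) when the
    pattern does not apply.
    """
    match = RELATIVE_CLAUSE.match(sentence.strip())
    if match is None:
        return [sentence]
    np = match.group("np")
    clause = match.group("clause")
    rest = match.group("rest")
    # The noun phrase is simply repeated; a real system would have to
    # choose a suitable referring expression here to preserve cohesion.
    return [f"{np} {rest}.", f"{np} {clause}."]


if __name__ == "__main__":
    example = "The report, which runs to 195 pages, describes a simplification system."
    for simplified in split_relative_clause(example):
        print(simplified)
```

Repeating the full noun phrase in the second sentence is the crudest possible choice; it is exactly the kind of decision that the discourse-level analysis described above is meant to replace.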
Full text
PDF (1.3 MB)
BibTeX record
@TechReport{UCAM-CL-TR-597,
  author =      {Siddharthan, Advaith},
  title =       {{Syntactic simplification and text cohesion}},
  year =        2004,
  month =       aug,
  url =         {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-597.pdf},
  institution = {University of Cambridge, Computer Laboratory},
  doi =         {10.48456/tr-597},
  number =      {UCAM-CL-TR-597}
}