CitRAZ: Rhetorical Citation Maps and Domain-Independent Argumentative Zoning

Dan Tidhar (until May 2006)

Simone Teufel (Principal Investigator)

Advaith Siddharthan

Bill Hollingsworth

Anna Ritchie

This is a First Grant project funded by EPSRC, grant no. GR/S27832/01. Runtime: 02/2004--10/2006.

The objective of Component A is to demonstrate that Argumentative Zoning (AZ) is feasible and useful as an intermediate discourse analysis for longer texts. If AZ is to be applied to longer texts with varying structure, different writing phenomena and possibly different rhetorical devices must be accounted for. Part of this component has resulted in an analysis and transformation of the ACL Anthology (Hollingsworth et al., 2005), particularly the part of the anthology covering the journal Computational Linguistics. See also our recent related work on IR applications of ACL citations (Ritchie et al., 2006).

The objective of Component B is to investigate the use of citation material, selected using guidance from AZ, in order to create more valuable document surrogates. The improvement of Citation Maps over automatic citation indexers (such as Google Scholar and CiteSeer) is that they distinguish contrastive statements, including direct comparisons and criticisms, from continuative statements, where the cited work is declared to be part of the current paper's solution. Automatic citation classification therefore sits at the core of CitRAZ's objectives. Our published results in this work package (Teufel et al., 2006a, b) have contributed a workable and consistent annotation scheme for citation classification, which can be summarised by the following table:

Category  Description
Weak      Weakness of cited approach
CoCoGM    Contrast/Comparison in Goals or Methods (neutral)
CoCo-     Author's work is stated to be superior to cited work
CoCoR0    Contrast/Comparison in Results (neutral)
CoCoXY    Contrast between 2 cited methods
PBas      Author uses cited work as basis or starting point
PUse      Author uses tools/algorithms/data/definitions
PModi     Author adapts or modifies tools/algorithms/data
PMot      This citation is positive about approach used or problem addressed (used to motivate work in current paper)
PSim      Author's work and cited work are similar
PSup      Author's work and cited work are compatible/provide support for each other
Neut      Neutral description of cited work, or not enough textual evidence for above categories, or unlisted/unknown citation function
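
As an illustration only, the twelve categories in the table above could be held in a small lookup structure and collapsed into the contrastive/continuative distinction that Citation Maps rely on. The sketch below is not the project's implementation: the names CITATION_FUNCTIONS, CONTRASTIVE, CONTINUATIVE, coarse_group and summarise are hypothetical, and the assignment of labels to the two coarse groups is an assumption made for this example (PMot, PSim, PSup and Neut are simply left in an "other" group).

```python
from collections import Counter

# Labels and glosses taken from the annotation scheme table.
CITATION_FUNCTIONS = {
    "Weak": "Weakness of cited approach",
    "CoCoGM": "Contrast/Comparison in Goals or Methods (neutral)",
    "CoCo-": "Author's work is stated to be superior to cited work",
    "CoCoR0": "Contrast/Comparison in Results (neutral)",
    "CoCoXY": "Contrast between 2 cited methods",
    "PBas": "Author uses cited work as basis or starting point",
    "PUse": "Author uses tools/algorithms/data/definitions",
    "PModi": "Author adapts or modifies tools/algorithms/data",
    "PMot": "Citation is positive about approach used or problem addressed",
    "PSim": "Author's work and cited work are similar",
    "PSup": "Author's work and cited work are compatible/support each other",
    "Neut": "Neutral description, insufficient evidence, or unknown function",
}

# ASSUMPTION: this coarse grouping is chosen for illustration only.
# Contrastive = criticisms and comparisons; continuative = cited work
# is used as part of the current paper's solution.
CONTRASTIVE = {"Weak", "CoCoGM", "CoCo-", "CoCoR0", "CoCoXY"}
CONTINUATIVE = {"PBas", "PUse", "PModi"}


def coarse_group(label):
    """Map a fine-grained citation function label to a coarse group."""
    if label not in CITATION_FUNCTIONS:
        raise KeyError(f"unknown citation function: {label!r}")
    if label in CONTRASTIVE:
        return "contrastive"
    if label in CONTINUATIVE:
        return "continuative"
    return "other"


def summarise(labels):
    """Count how many citations fall into each coarse group."""
    return Counter(coarse_group(label) for label in labels)
```

Given the labels of all citations to one paper, summarise(labels) would then report how often that paper is contrasted with versus built upon, which is the kind of information a citation map aims to expose.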

Whereas Component A ports Argumentative Zoning to a new text type, Component C concerns the move to a different scientific domain, namely bioinformatics. Meta-discourse can be expected to differ across scientific domains, due to differences in writing styles and conventions. We have chosen bioinformatics texts because we expect meta-discourse in the life sciences to be maximally different from that in computational linguistics. In previous work in the medical domain (Teufel et al., 2001), we observed that there seems to be less meta-discourse overall in this domain, but also less variation. This component has resulted in research on meta-discourse discovery (Abdalla and Teufel, 2006).



