Computer Laboratory

Awais Athar - Citation Context Corpus

The citation context corpus consists of the full text of 852 papers that cite the 20 target papers in the citation sentiment corpus with the highest number of objective citations. The corpus contains 1,034 paper-reference pairs and 203,803 sentences from the ACL Anthology Network (AAN) corpus. I examined all 203,803 sentences manually and identified the sentences in the citation context. These context sentences contain formal or informal citations to the target paper and were assigned classes according to their sentiment (negative, positive, objective/neutral). The remaining sentences, i.e., those that do not refer to the target papers, were classified as being excluded (x) from the context.

This data is presented as a set of HTML files, where each file contains all papers in the AAN that cite a specific target paper. Each file contains a table in which each row corresponds to a citing paper, and each square in that row is a sentence in the citing paper. Rows are sorted by increasing publication date. The colour of each square represents the sentiment of the sentence. A legend of these colours is visible below the title. Hovering the mouse over the first square in a row displays the paper meta-data for that row. Hovering over any other square displays the text of the corresponding sentence. Clicking the check-boxes at the top hides or shows squares of the corresponding sentiment.
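For readers who want to process the files programmatically rather than view them in a browser, the following is a minimal sketch of how the per-paper sentiment labels might be tallied. It rests on assumptions that are not confirmed by the description above: that each citing paper is a `<tr>` element, that each square is a `<td>` element, and that the sentiment of a square is exposed through its `class` attribute. The names `CitationTableParser` and `sentiment_counts`, and the class values in the sample markup, are hypothetical; inspect the actual files and adjust the tag and attribute names before relying on this.

```python
from collections import Counter
from html.parser import HTMLParser


class CitationTableParser(HTMLParser):
    """Collects, for each table row, the class attribute of every cell.

    Assumed (not confirmed by the corpus) markup: one <tr> per citing paper,
    one <td> per square, sentiment stored in the cell's class attribute.
    """

    def __init__(self):
        super().__init__()
        self.rows = []        # one list of cell labels per citing paper
        self._current = None  # labels of the row being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._current = []
        elif tag == "td" and self._current is not None:
            # Cells without a class attribute are recorded as "unknown".
            self._current.append(dict(attrs).get("class") or "unknown")

    def handle_endtag(self, tag):
        if tag == "tr" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def sentiment_counts(html_text):
    """Return one Counter per row, skipping the first (meta-data) square."""
    parser = CitationTableParser()
    parser.feed(html_text)
    parser.close()
    return [Counter(row[1:]) for row in parser.rows]


if __name__ == "__main__":
    # Hypothetical markup; the class values here are placeholders.
    sample = (
        "<table>"
        "<tr><td class='meta'></td><td class='x'></td>"
        "<td class='o'></td><td class='x'></td></tr>"
        "<tr><td class='meta'></td><td class='p'></td></tr>"
        "</table>"
    )
    for paper_index, counts in enumerate(sentiment_counts(sample)):
        print(paper_index, dict(counts))
```

The sketch deliberately makes no assumption about which label names occur: it counts whatever class values it finds, so the mapping from those values to negative, positive, objective/neutral and excluded has to be read from the legend in each file.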

For further details, or to cite this corpus, please see the following papers:

@InProceedings{athar-teufel:2012:DSSD,
  author    = {Athar, Awais  and  Teufel, Simone},
  title     = {Detection of Implicit Citations for Sentiment Detection},
  booktitle = {Proceedings of the Workshop on Detecting Structure in Scholarly Discourse},
  month     = {July},
  year      = {2012},
  address   = {Jeju Island, Korea},
  publisher = {Association for Computational Linguistics},
  pages     = {18--26},
  url       = {http://www.aclweb.org/anthology/W12-4303}
}

@InProceedings{athar-teufel:2012:NAACL-HLT,
  author    = {Athar, Awais  and  Teufel, Simone},
  title     = {Context-Enhanced Citation Sentiment Detection},
  booktitle = {Proceedings of the 2012 Conference of the NAACL:HLT},
  month     = {June},
  year      = {2012},
  address   = {Montr\'{e}al, Canada},
  publisher = {Association for Computational Linguistics},
  pages     = {597--601},
  url       = {http://www.aclweb.org/anthology/N12-1073}
}