Parsing the Web @ JHU -- The coordination task



N-best version of the parser


In brief: Bibliography | Corpus | Bigram/trigram symmetry in POS tags (possym) | WordNet hypernyms (wntags) | Word similarity using WordNet (wncompsims) | Score parser conjunctions using WordNet similarity (scoreconjsim)


21 June 2009 -- Added scoreconjsim. The program assigns a score to each conjunction found by the C&C parser, using the output of compwnsims. An example of output is here.
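As a rough illustration of scoring a conjunction from word similarities, here is a minimal sketch. It is not the scoreconjsim code: the function name `score_conjunction`, the similarity table, and the choice of taking the highest cross-conjunct similarity are all assumptions made for the example.

```python
# Hypothetical sketch: score one conjunction from a table of word similarities.
# The table format and the "max over word pairs" rule are assumptions,
# not the actual scoreconjsim behaviour.

def score_conjunction(left_words, right_words, sims):
    """Return the highest similarity between a word of the left conjunct
    and a word of the right conjunct (0.0 if no pair is known)."""
    scores = [
        sims.get(frozenset((left, right)), 0.0)
        for left in left_words
        for right in right_words
    ]
    return max(scores, default=0.0)


# Toy similarity values, invented for illustration only.
sims = {
    frozenset(("cat", "dog")): 0.8,
    frozenset(("cat", "table")): 0.1,
}

print(score_conjunction(["cat"], ["dog"], sims))
print(score_conjunction(["cat"], ["table"], sims))
```

Under this assumption, a parse that conjoins `cat` with `dog` would score higher than one that conjoins `cat` with `table`.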

20 June 2009 -- Added compwnsims. The program calculates the similarity between each pair of nouns/verbs in a sentence using the output of wntags. This is a very basic implementation using a cosine function. An example of output is here.
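One way to compute a cosine between two words tagged with hypernyms is to treat each tag set as a binary vector. The sketch below assumes that representation; how compwnsims actually builds its vectors is not stated here, and the names `cosine` and `pairwise_similarities` are hypothetical.

```python
import math
from itertools import combinations


def cosine(tags_a, tags_b):
    """Cosine of two tag sets viewed as binary vectors:
    |A & B| / sqrt(|A| * |B|). Returns 0.0 if either set is empty."""
    if not tags_a or not tags_b:
        return 0.0
    return len(tags_a & tags_b) / math.sqrt(len(tags_a) * len(tags_b))


def pairwise_similarities(tagged):
    """Map each unordered pair of words to the cosine of their tag sets.
    `tagged` is a dict from word to a set of hypernym tags (assumed format)."""
    return {
        (w1, w2): cosine(tagged[w1], tagged[w2])
        for w1, w2 in combinations(sorted(tagged), 2)
    }


# Toy hypernym tags, invented for illustration only.
tagged = {
    "cat": {"feline", "animal"},
    "dog": {"canine", "animal"},
    "table": {"furniture"},
}

for pair, sim in pairwise_similarities(tagged).items():
    print(pair, sim)
```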

17 June 2009 -- Added wntags. The program tags each noun and verb in a sentence with its WordNet hypernyms. All senses are collapsed. This implements the tagging done by Agarwal and Boggess (1992) for medical texts in a basic, but more general way. An example of output can be found here.
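The idea of collapsing all senses can be sketched as taking the union of the hypernyms reachable from every sense of a word. The example below uses a tiny hand-made taxonomy rather than WordNet itself; the sense labels, the dictionaries, and the function name `hypernym_tags` are invented for illustration and do not reflect the wntags implementation.

```python
# Toy data standing in for WordNet (invented, not real synset identifiers).
TOY_SENSES = {
    "bank": ["bank/river", "bank/finance"],
}
TOY_HYPERNYMS = {
    "bank/river": ["slope"],
    "slope": ["landform"],
    "bank/finance": ["institution"],
    "institution": ["organization"],
}


def hypernym_tags(word):
    """Return the union of all hypernyms reachable from any sense of `word`.
    All senses are collapsed: no disambiguation is attempted."""
    tags = set()
    stack = list(TOY_SENSES.get(word, []))
    while stack:
        node = stack.pop()
        for parent in TOY_HYPERNYMS.get(node, []):
            if parent not in tags:
                tags.add(parent)
                stack.append(parent)
    return tags


print(sorted(hypernym_tags("bank")))
```

Because the senses are merged, `bank` receives both the landform tags and the organization tags, whichever sense the sentence intends.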

15 June 2009 -- Added possym. The program looks for symmetries in parts of speech on either side of a coordination. Both bigrams and trigrams are considered. This implements in a very basic way the search for syntactic symmetries in Agarwal and Boggess (1992). An example of output can be found here.
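A minimal sketch of such a symmetry check follows. It assumes that "symmetry" means the same tag sequence appears immediately before and immediately after the conjunction; possym may use a different definition. The function name `pos_symmetry` and the tag sequence are hypothetical.

```python
def pos_symmetry(tags, conj_index, sizes=(2, 3)):
    """For each n-gram size, report whether the n tags just before the
    conjunction equal the n tags just after it."""
    result = {}
    for n in sizes:
        left = tags[max(0, conj_index - n):conj_index]
        right = tags[conj_index + 1:conj_index + 1 + n]
        # Require a full n-gram on the left; a short right side never matches.
        result[n] = len(left) == n and left == right
    return result


# POS tags for "the black cat and the white dog" (tagging assumed).
tags = ["DT", "JJ", "NN", "CC", "DT", "JJ", "NN"]
print(pos_symmetry(tags, tags.index("CC")))
```

In this example the trigrams on either side match (DT JJ NN), while the adjacent bigrams (JJ NN versus DT JJ) do not.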

15 June 2009 -- Added corpus of coordinations. Limited to sentences of 30 words or fewer containing `and' or `or'. Obtained from 5000 pages of Wikipedia data. The CCG parse is collapsed onto one line, with the original sentence afterwards. The corpus is here. There is also a version showing coordinations only, where I have started marking the sentences with an incorrect and/or parse.
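The selection criterion can be sketched as a simple filter. This assumes whitespace tokenization and case-insensitive matching, which may differ from how the corpus was actually built; the function name `keep_sentence` is hypothetical.

```python
def keep_sentence(sentence, max_words=30):
    """Keep sentences of at most `max_words` whitespace-separated tokens
    that contain 'and' or 'or' as a token.
    Note: a token with attached punctuation (e.g. 'and,') is not matched."""
    words = sentence.lower().split()
    return len(words) <= max_words and any(w in ("and", "or") for w in words)


print(keep_sentence("Cats and dogs are common pets ."))
print(keep_sentence("Cats are common pets ."))
```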

15 June 2009 -- Added bibliography. This is still in progress.