Simone Teufel

University of Cambridge Computer Laboratory
William Gates Building, JJ Thomson Ave,
CAMBRIDGE CB3 0FD, United Kingdom.
work: (+44) 1223 763643, fax: (+44) 1223 334678
email: sht25@cl.cam.ac.uk.

Research

My research is on summarization, IR, and the use of NLP techniques for web-based applications. My PhD (Argumentative Zoning: Information Extraction from Scientific Articles) was concerned with robust rhetorical processing of scientific articles, in order to facilitate science-specific searches. The integration of citation indexing, IE techniques and more traditional summarization techniques results in "rhetorically zoned" articles. I am also interested in cognitive experiments to demonstrate the usefulness of this type of robust processing in a real user environment. Another ongoing interest is the evaluation of summarization systems (a hard problem plaguing the community), particularly task-based evaluation.

My current research plans involve robust generation from sentence fragments and the integration of IE, IR and citation indexing for the specific searches that a scientist might want to do on large repositories of scientific articles.

Projects

I am/was involved in the following projects at the Computer Lab:

My publications are online here.

Teaching

This year I have taught half of each of two courses, "Computing and the Web" and "Internet Applications", on the CSTIT MPhil course, which is run jointly by the Engineering Department and the Computer Laboratory.

Biography

I got my PhD from CogSci in 1999, where I was mostly concerned with summarization and the structure of scientific arguments (cf. Argumentative Zoning). I also worked at the HCRC Language Technology Group.

During a postdoc at Columbia University (2000-2001), I worked on the Digital Libraries project PERSIVAL, one of whose aims is to provide patient-specific access to large collections of scientific articles. In one part of the project, we reranked the output of searches in the field of cardiology to favour those articles relevant to the particular patient the cardiologist is currently considering. I also worked on the TIDES project on multilingual summarization at Columbia.

Previously, at IMS, University of Stuttgart, where I got my first degree, I was involved in the work of the EAGLES corpus and lexicon standardisation group. Part of my work was a tagset conversion tool; another part was testing how text type, the lexicon used and the automatic tagger interact to affect the results. I also spent some time at XRCE Xerox in Grenoble, working on the extraction of nominalizations and collocations.


Simone Teufel
Created: October 29, 2001