Simone Teufel

University of Cambridge Computer Laboratory
William Gates Building, JJ Thomson Ave,
CAMBRIDGE CB3 0FD, United Kingdom.
work: (+44) 1223 763643, fax: (+44) 1223 334678
email: sht25@cl.cam.ac.uk.

Reader in Information and Language at the NLIP group at the Computer Laboratory of the University of Cambridge

Research

My area of research is text understanding. In particular, I develop models of discourse structure and argumentation in scientific text. Several applications could profit from an analysis of a text's logical structure, for instance text summarization, scientific search engines, improved bibliometrics, detection of "hot ideas" in a scientific field, and tools for better academic writing. The discourse analysis I propose (called Argumentative Zoning or AZ) is based on the recognition of the following phenomena: sentiment expressed towards cited work, ownership of ideas, and speech acts which express rhetorical statements typical of scientific argumentation. Co-reference between entities mentioned in text and coherence of text pieces also play an important role in my model. I am also interested in cognitive experiments to demonstrate the usefulness of this type of robust processing in a real user environment, particularly in task-based evaluations.

Biography

My first degree in Computer Science is from the University of Stuttgart, more specifically from the Center for Computational Linguistics (IMS). At the IMS, I was involved in designing the STTS tagset for German corpora, and was also a member of the EAGLES corpus and lexicon standardisation group. I also spent some time at XRCE Xerox in Grenoble, working on the extraction of nominalizations and collocations.

I received my PhD in Cognitive Science from the School of Informatics at the University of Edinburgh in 2000. My PhD thesis (on Argumentative Zoning) is available here. During my PhD, I was also a member of the HCRC Language Technology Group.

During a postdoc at Columbia University (2000-2001), I worked on the Digital Libraries project PERSIVAL, whose aims include providing patient-specific access to large collections of scientific articles. In a subpart of the project, we reranked the output of searches in the field of cardiology to favour those articles which are relevant to the particular patient the cardiologist is currently considering. I also worked on the TIDES project on multilingual summarization at Columbia.

I joined the NLIP group at the University of Cambridge in 2001 as a lecturer, and have been reader in information and language since 2010. Most of my funded research involves text understanding or text mining and search from scientific articles.

Projects

I am/was involved in the following research projects:

Corpora

I was involved in the creation of the following corpora (either in projects or with students), which are distributed here:

College

I am a Fellow of Computer Science at King's College.

Publications

My publications are online here.

Teaching

I am currently teaching the following courses:

I have taught the following courses in earlier years:

PhD Students

MPhil projects

Here is a list of my previous project suggestions for 2012/3 and for 2013/4.

Past postdocs

Part II projects


Created: October 29, 2001