I am Reader in Natural Language Processing at the University of Cambridge Computer Laboratory. Previously I was a member of Faculty at the University of Oxford.
My research area is Natural Language Processing and Computational Linguistics. I develop probabilistic and data-driven models for the syntactic and semantic analysis of natural language.
Research Output
Publications:
- Complete list of publications
- Profile on Google Scholar
Software and Demos:
- C&C language processing tools (Demo)
- Yue Zhang's ZPar
- Frannie Chang's Linguistic Steganography demo
Recent seminars:
- A Mathematical Framework for a Distributional Compositional Model of Meaning (Stanford, May 13; Groningen, Apr 13; King's College London, Nov 12; Essex, Oct 12; Edinburgh, Feb 12; Potsdam, Dec 11) [pdf]
- Distributional and Compositional Models of Meaning for Natural Language (Ulster, Mar 13; Sheffield, Dec 12; Ulster, July 11; Oxford, Oct 10) [pdf]
- Linguistic Steganography: Information Hiding in Text (Edinburgh, May 12; Sheffield, Mar 11; Surrey, Nov 09) [pdf]
- Parsing Fast and Deep with a wide-coverage lexicalised-grammar parser (Trento, Oct 11; Amsterdam, Dec 10) [pdf]
- How to Give a Technical Presentation in Computer Science (Ulster, Mar 13) [pdf]
Collaborators
Current research students:
- Saad Aloteibi. A user-centered approach to Information Retrieval
- Sandro Bauer. Information and knowledge extraction using structured knowledge bases
- Ching-Yun (Frannie) Chang. Transformations for Linguistic Steganography
- Douwe Kiela. Compositional distributional semantics
- Wenduan Xu. CCG parsing
Current research associates:
- Laura Rimell. Compositional distributional semantics
- Tamara Polajnar. Compositional distributional semantics
- Andreas Vlachos. Semantic Parsing on the SPACEBOOK project
Past research students (with thesis titles):
- James Smith (DPhil, Oxford, 2012). Example-Based Methods for Natural Language Processing - with applications to machine translation and preposition correction
- Yue Zhang (DPhil, Oxford, 2009). Discriminative Learning Approaches for the Statistical Processing of Chinese
- Brian Harrington (DPhil, Oxford, 2009). ASKNet: Automatically Creating Semantic Knowledge Networks from Natural Language Text
Grants
My research is funded by the European Research Council (ERC), the Engineering and Physical Sciences Research Council (EPSRC), the EU 7th Framework Programme (FP7), Google, and Microsoft.
- Distributional Compositional Semantics for Text Processing (DisCoTex). ERC Starting Grant (2012-2017)
- A Unified Model of Compositional and Distributional Semantics: Theory and Applications. EPSRC (2012-2015)
- Knowledge Discovery and Extraction from Large-Scale Entity-Relationship Networks. Microsoft Research PhD Scholarship Programme (2012-2015)
- SpaceBook - Spatial & Personal Adaptive Communication Environment. EU FP7 (2011-2014)
- Knowledge Extraction and Discovery from Large-Scale Entity-Relationship Graphs. Google Research Award (2011-2012)
- FAUST - Feedback for User Adaptive Statistical Translation. EU FP7 (2010-2013)
- Accurate and Efficient Parsing of Biomedical Text. EPSRC (2007-2010)
- Example-Based Methods for Natural Language Processing. EPSRC CASE studentship with Sharp Laboratories of Europe (2005-2009)
Teaching
At Cambridge I teach/have taught the following courses:
- Introduction to Natural Language Processing (MPhil ACS, Part III, Michaelmas 2012, 2011, 2010)
- Syntax and Semantics of Natural Language (MPhil ACS, Part III, Lent 2013, 2012, 2011)
- Statistical Machine Translation (MPhil ACS, Part III, Lent 2013, 2011)
- Machine Learning for Language Processing (MPhil ACS, Part III, Lent 2013)
- Programming in C and C++ (Part IB, Michaelmas 2011)
- Information Retrieval (Part II, 2009)
- Various text and language processing modules on the MPhil in Computer Speech, Text and Internet Technology (2009-10)
At Oxford I developed a popular MSc course on Information Retrieval and Statistical Text Processing, which ran for five years, and tutored Keble College undergraduates across a range of computer science subjects. I also supervised 18 six-month MSc projects on a variety of topics in language processing and AI, as well as a number of final-year undergraduate projects. In 2007 I was awarded an Oxford University Teaching Award.
Media
- Our ambiguous world of words (University of Cambridge Research Horizons, May 2013, pp.10-11)
- Edinburgh Computer Science Podcast (August 2012, talking about Linguistic Steganography and Natural Language Parsing)
- Breaking new ground in Natural Language Processing (Cambridge Language Sciences Initiative, May 2012)
- Quantum Links Let Computers Understand Language (New Scientist, issue 2790, December 2010)
Activities
- Member (ex officio) of the Executive Committee of the Association for Computational Linguistics (2013-2015)
- Chair/Chair-elect of the European Chapter of the Association for Computational Linguistics (EACL) (2011-2015)
- Program co-chair (with Sandra Carberry) for the 48th Annual Meeting of the ACL (ACL-10)
- Team Leader for the JHU Research Workshop on Large-Scale Syntactic Processing: Parsing the Web (2009)
- Area chair (Syntax and Parsing) for ACL-08, EMNLP-09, IJCNLP-11, and ACL-12
- Workshops co-chair for EACL-09
- Editorial Board member for Journal of Artificial Intelligence Research (2011-2014), Computational Linguistics (2009-2012), Computer Speech and Language (2009-), and Journal of Natural Language Engineering (2004-2016)
Contact
University of Cambridge Computer Laboratory
William Gates Building, 15 JJ Thomson Avenue
Cambridge CB3 0FD, UK
stephen.clark@cl.cam.ac.uk
+44 (0)1223 763704
© Stephen Clark. Last updated: May 2013.
