Course pages 2016–17 (still under preparation!)
Introduction to Natural Language Syntax and Parsing
This module provides a brief introduction to linguistics for computer scientists and then covers some of the core tasks in natural language processing (NLP), focussing on statistical parsing of sentences to yield syntactic and semantic representations. We will look at how to evaluate parsers and see how well current state-of-the-art tools perform.
- Linguistics for NLP - morphology, syntax, semantics, pragmatics (of English) [6 sessions, TB]
- Parsing - grammars, treebanks, representations and evaluation, statistical parse ranking [8 sessions, SC]
- Interpretation - compositional semantics and entailment, pragmatic inference [2 sessions, TB]
On completion of this module, students should:
- understand the basic properties of human languages and be familiar with descriptive and theoretical frameworks for handling these properties;
- understand the design of tools for NLP tasks such as parsing and be able to apply them to text and evaluate their performance;
- understand some of the basic principles of the representation of linguistic meaning and interpretative inference.
- Week 6: Download and apply a PSG-based parser to a designated text. Evaluate the performance of the tools quantitatively and qualitatively.
- Week 8: Download and apply a CCG-based parser to a designated text. Evaluate the performance of the tools quantitatively and qualitatively.
- There will be three ticked, short, take-home assignments on linguistic analysis during weeks 2-4 and one on semantic interpretation in week 7. Each assignment is worth 5% of the final mark.
- An assessed practical report based on the practicals described above. The report will describe the work done in no more than 5000 words and will contribute 80% of the final mark.
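As a rough illustration of what the quantitative evaluation in the practicals might involve, the sketch below computes labelled bracket precision, recall and F1 (PARSEVAL-style) for one gold tree and one parser output. It is a minimal sketch under stated assumptions, not the evaluation procedure prescribed for the practicals: the function names `read_tree`, `labelled_spans` and `parseval` and the example sentence are hypothetical, trees are assumed to be in Penn-Treebank-style bracketed notation, and the conventions of standard scoring tools (such as ignoring punctuation) are not reproduced. CCG parsers are commonly scored on recovered dependencies instead, as in Clark & Curran (2007).

```python
from collections import Counter


def read_tree(s):
    """Parse a bracketed tree string into nested (label, children) tuples.

    Leaves (words) are plain strings.
    """
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return parse()


def labelled_spans(tree):
    """Collect (label, start, end) for every phrasal constituent.

    Preterminal (part-of-speech) nodes are skipped, so only phrases count.
    """
    spans = Counter()

    def walk(node, start):
        label, children = node
        end = start
        for child in children:
            if isinstance(child, str):
                end += 1  # a word advances the span by one token
            else:
                end = walk(child, end)
        if not all(isinstance(c, str) for c in children):
            spans[(label, start, end)] += 1
        return end

    walk(tree, 0)
    return spans


def parseval(gold, test):
    """Return labelled bracket (precision, recall, F1) for one sentence."""
    g = labelled_spans(read_tree(gold))
    t = labelled_spans(read_tree(test))
    matched = sum((g & t).values())  # multiset intersection of brackets
    precision = matched / sum(t.values()) if t else 0.0
    recall = matched / sum(g.values()) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the parser attaches the PP to the VP, not the NP.
gold = ("(S (NP (DT the) (NN man)) (VP (VBD saw) (NP (NP (DT the) (NN dog)) "
        "(PP (IN with) (NP (DT a) (NN telescope))))))")
test = ("(S (NP (DT the) (NN man)) (VP (VBD saw) (NP (DT the) (NN dog)) "
        "(PP (IN with) (NP (DT a) (NN telescope)))))")
p, r, f = parseval(gold, test)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this example the parser output contains six phrasal brackets, all of which occur in the gold tree, while the gold tree has seven, so precision is 1 and recall is 6/7. A corpus-level score would sum matched, gold and test bracket counts over all sentences before dividing, rather than averaging per-sentence scores.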
Bos, J. & Blackburn, P. Representation and Inference for Natural Language and Working with Discourse Representation Theory.
Clark, S. & Curran, J.R. (2007). Wide-coverage efficient statistical parsing with CCG and log-linear models. Computational Linguistics, 33(4), pp. 493-552.
Jurafsky, D. & Martin, J. (2008). Speech and Language Processing. Prentice-Hall (2nd ed.).