In 2021-22 and 2022-23 I co-lectured the Computer Science course ‘(Overview of) Natural Language Processing’ for undergraduates and postgraduates (course page).

In 2020-21 I organised the ‘Non-standard NLP’ topic for the ACS MPhil course on ‘Advanced topics in machine learning and natural language processing’ (R250; course page).

In 2019-20 I organised the ‘Non-standard NLP’ topic for R250 (course page).

In 2018-19 I co-organised the ‘NLP & ML for Speech’ topic for R250 (course page), and supervised for the Formal Models of Language course in the Computer Science Tripos.

I previously coordinated the Computational Linguistics paper (Li18) for the Linguistics Tripos in 2016-17 and 2017-18. This involved organising the paper, lecturing, and supervising. I’ve also supervised Li3 (Language, Brain and Society), Li4 (History and Varieties of English), Li11 (Historical Linguistics), Li13 (History of English), and Li15 (Language Acquisition).

I’ve given research methods seminars to Linguistics MPhil students on bibliography management and the use of corpora, and to PhD students on reproducible research, data management, presenting data, and writing abstracts. I’ve also given a seminar on presenting results as part of the research skills programme in the Computer Laboratory.


I’ve supervised several undergraduate and master’s dissertations in Linguistics and Computer Science & Technology (NLIP), and welcome discussion of new projects. Examples of recent projects include:

Computer Science 3rd year (Part II) undergraduates

  • Exploring the use of cognate relations to automatically create a lexical database of synonyms
  • Evaluating Features For Modelling Complex Word Identification
  • Computational Modelling of Child Language Acquisition
  • Simulating Natural Language Learning and Evolution
  • A Comparison of Statistical and Neural Techniques for Grammatical Error Correction

Computer Science 4th year / MPhil (ACS) students

  • Very Low-Resource Machine Translation with Reference Languages
  • Word segmentation and lexicon learning from child-directed speech using multiple cues
  • Deep learning for text-based mental health diagnosis
  • An Expectation-Maximisation Algorithm for Automated Cognate Detection
  • Automatically analyzing negative interactions and relationships between members of an underground forum

Linguistics undergraduates

  • A Cross-Linguistic Approach to the Syntactic Parsing of Gapping Constructions
  • The cross-linguistic generalizability of predicting word boundaries in child-directed speech using incremental phonotactic and lexical information
  • Moving From Grammaticality to Fluency in the Automatic Error Correction of Learner Texts
  • A study to evaluate machine translation of French and English social media posts
  • Investigating the effect of Adaptive Teaching on grammar learning

For 2022-23 I have posted some new project ideas for Computer Science Part II projects here and for ACS projects here. Please contact me if you wish to discuss project ideas of your own.

See also my page about using the HPC, and my unofficial general project advice. I’ve repeated this advice to different students at different times, so perhaps it will be useful for you too.

Linguistics at Cambridge

I am the temporary Director of Studies for Linguistics undergraduates at Sidney Sussex College. Please get in touch if you are interested in applying to study this subject at Cambridge.

Contact me: firstname.lastname @ cl.cam.ac.uk