Department of Computer Science and Technology

Course pages 2019–20

Machine Learning for Language Processing

The assessment is primarily via the project (95%), with the remaining 5% for attendance at lecture sessions, reading of assigned material, and satisfactory contribution during lectures.

The project consists of picking a task/dataset (suggestions below), implementing an approach, comparing against the literature, and writing a report. Assessment is based on the report, which should be at most 5000 words. The code, while not evaluated per se, should also be made available (e.g. on GitHub) together with instructions on how to reproduce the results given in the report. We encourage you to follow the format of a recent CL conference, e.g. ACL 2020; such conferences sometimes offer templates ready for editing online.

Your report should address the following questions:

  • Introduction: What is the task and why is it important?
  • Literature review (minimum 3 papers)
    • Identify weaknesses / room for improvement
    • Motivate your approach
  • Detail your proposed approach
  • Experiments:
    • Proper experimental design
    • Train/dev/test
    • What is the error metric for your task?
    • How did you choose your hyperparameters?
    • Does your idea work as expected?
    • Error analysis / learning-curve plots
  • Conclusions: What have we learnt from your experiments that could inform future work?
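
As an illustration of the train/dev/test and hyperparameter points under Experiments, here is a minimal sketch of one possible protocol: split the data once with a fixed seed, choose hyperparameters by error on the dev set only, and leave the test set for the final reported number. The function names, the split fractions, and the `train_fn`/`error_fn` callables are hypothetical placeholders, not part of any course-provided code; if your dataset ships with a standard split, use that instead so your results are comparable with the literature.

```python
import random


def split_train_dev_test(examples, dev_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle once with a fixed seed and cut into three disjoint parts.

    The fractions are illustrative assumptions, not a recommendation.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_test = int(len(items) * test_frac)
    n_dev = int(len(items) * dev_frac)
    test = items[:n_test]
    dev = items[n_test:n_test + n_dev]
    train = items[n_test + n_dev:]
    return train, dev, test


def select_on_dev(candidates, train_fn, error_fn, train, dev):
    """Return (hyperparameter, dev_error) for the candidate with lowest dev error.

    train_fn(train, hp) -> model and error_fn(model, data) -> float are
    supplied by the caller; the test set is never touched here.
    """
    best = None
    for hp in candidates:
        model = train_fn(train, hp)
        err = error_fn(model, dev)
        if best is None or err < best[1]:
            best = (hp, err)
    return best
```

Once a hyperparameter setting has been selected this way, the model would be evaluated on the test set a single time, and that figure is the one to report.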

Datasets/tasks:

Submission deadline: 14/1/2020, 4pm (via Moodle)