Computer Laboratory

Course pages 2016–17

Machine Learning and Real-world Data


Session 3 (Statistical Laws of Language), slide 11, title Vocabulary size:

In the original slides, the relationship between u_n (the number of unique items in the vocabulary) and n (the text size) was wrongly described as `exponential'. The word has been removed, since the formula (with 0 < b < 1) fully specifies the type of relationship.
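As a rough illustration of why `exponential' was the wrong word, the sketch below assumes the slide's formula has the usual Heaps' law form u_n = k * n^b. That form, the function name vocab_size, and the constants k = 50 and b = 0.5 are assumptions made for illustration, not values taken from the slides.

```python
def vocab_size(n, k=50.0, b=0.5):
    """Estimate the number of unique items u_n in a text of n tokens.

    Assumes a Heaps'-law-style power law, u_n = k * n**b.
    k and b are illustrative placeholders, not course values.
    """
    if n < 0:
        raise ValueError("text size n must be non-negative")
    if not 0 < b < 1:
        raise ValueError("b is expected to satisfy 0 < b < 1")
    return k * n ** b


if __name__ == "__main__":
    # Each tenfold increase in text size multiplies the estimate by 10**b,
    # which is less than 10 because b < 1.
    for n in (10_000, 100_000, 1_000_000):
        print(n, round(vocab_size(n)))
```

With 0 < b < 1 the estimate grows more slowly than the text itself: multiplying n by a constant multiplies u_n by a smaller constant. An exponential relationship would instead multiply u_n by a constant for every fixed increase in n.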

Other changes (May 27)

Course Outline

The main teaching for the course is being done via Moodle. However, a range of material is also available on this site. PDF versions of all notes are available here, for ease of printing. The notes and slides will be added incrementally. Some material is in draft form, as noted.

The course has 16 sessions. Each session will have a short introductory lecture, followed by a two-hour practical session. Most of these practical sessions have a `task', but four sessions are purely for catch-up.

Students are expected to attend the practical sessions in the Intel lab, but may skip the catch-up sessions if they have completed the work and obtained all their ticks up to that point.

The schedule may change in response to events, but any such changes will be announced on the Moodle forum.

Topic One: Statistical classification (7 sessions)

Topic Two: Hidden Markov Models (4 sessions)

Topic Three: Social Networks (4 sessions)