Machine Learning and Real-world Data
MLRD Under Lockdown: Zoom Demonstrated Sessions
Please read the Instructions here.
Join Demonstrated sessions: Zoom link
Sign up for each MLRD Tick using this link.
Examinability of material for MLRD
All examinable material can be found in the slides, the ticks or the additional notes, all of which are linked from the Moodle site. The slides, practical notes and additional notes are also available from this page. Materials will become available shortly before each session.
In some cases, students are asked to read additional material, such as some parts of the Easley and Kleinberg textbook. This is explicitly noted on the slides where the concepts are examinable.
Of course, some students may need to read further in order to fully understand the material presented. If you think this applies to you and are uncertain about what to read, ask your supervisor for help.
It is not necessary to do the starred ticks for the exams. However, looking at the material for the starred ticks may well help you understand the material more thoroughly.
Material introduced in catch up session lectures is not examinable.
For fairness, the lecturers will not answer any individual questions about whether material is or is not examinable.
Supervision Questions
There are 4 question sets here.
Course Outline
The main teaching for the course is being done via Moodle. However, various material is also available via this site. PDF versions of all notes are available here, for ease of printing.
The course has 16 sessions. Each session will have a short introductory lecture, followed by a two-hour practical Zoom session. For most of these practical sessions, there is a 'task' (with associated tick), but there are 4 sessions which are purely for catch up.
Students who cannot attend the demonstrated sessions (for instance due to time zone differences) should contact the lecturer.
The schedule may change in response to events, but any such changes will be announced on the Moodle forum.
Topic One: Statistical classification (7 sessions)
- Session 1: Introduction to sentiment classification
- Slides
- Practical notes
- Practical notes (pdf)
- Introduction to the course
- Introduction to the course (pdf)
- ML methodology: training, development and evaluation datasets
- ML methodology: training, development and evaluation datasets (pdf)
- Session 2: Naive Bayes Classifier
- Session 3: Statistical laws of language
- Session 4: Statistical testing
- Session 5: Overtraining and cross-validation
- Session 6: Uncertainty and human agreement
- Session 7: Catch up 1: Quick introduction to some other classifiers
Topic Two: Hidden Markov Models (4 sessions)
- Session 8: Training the HMM
ERRATUM: The voice-over in the video contains an error at around minute 12. Incoming transition probabilities into a state do not sum to one; only outgoing probabilities do. In matrix notation with our conventions: rows sum to 1, but columns in general do not. A corrected video was uploaded on Wed 17/2. The slides were correct at all times. Apologies for the confusion.
- Session 9: Viterbi algorithm
- Session 10: HMMs in a biological application
- Session 11: Catch up 2: Protein structure prediction
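The point made in the Session 8 erratum above can be checked numerically. The following is a minimal Python sketch, assuming a hypothetical 3-state transition matrix `A` in which `A[i][j]` is the probability of moving from state i to state j. The values and the helper names `row_sums` and `column_sums` are illustrative only and are not taken from the course materials.

```python
# Hypothetical 3-state HMM transition matrix (illustrative values only).
# Convention assumed here: A[i][j] = P(next state = j | current state = i).
A = [
    [0.5,   0.25, 0.25],
    [0.125, 0.75, 0.125],
    [0.25,  0.25, 0.5],
]


def row_sums(matrix):
    """Total outgoing probability from each state."""
    return [sum(row) for row in matrix]


def column_sums(matrix):
    """Total incoming probability into each state."""
    return [sum(column) for column in zip(*matrix)]


print(row_sums(A))     # [1.0, 1.0, 1.0]
print(column_sums(A))  # [0.875, 1.25, 0.875]
```

Each row is a probability distribution over next states, so it must sum to 1. Nothing constrains the columns, which is why the incoming totals here differ from 1.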
Topic Three: Social Networks (4 sessions)
- Session 12: Properties of Networks
- Session 13: Betweenness
- Session 14: Clustering