Machine Learning and Real-world Data
Course Outline
The practical materials for the course can be found on the course Moodle page. Slides and various other materials are available via this site. PDF versions of all notes are available here, for ease of printing.
The course has 16 sessions. Each task has a short introductory lecture, followed by a two-hour practical demonstration session in the Intel Lab. Most of the practical sessions are concerned with a task (with an associated tick), but 4 sessions are purely for catch up. Here are some general instructions on how to perform the tasks.
Topic One: Statistical classification (7 sessions)
- Session 1: Introduction to sentiment classification
- Slides
- Introduction to the course (also available in PDF)
- Session 2: Naive Bayes Classifier
- Slides
- Notes on Naive Bayes (also available in PDF)
- Session 3: Statistical laws of language
- Session 4: Statistical significance testing
- Slides
- Notes on significance testing (also available in PDF)
- Session 5: Overtraining and cross-validation
- Slides
- ML methodology: training, development and evaluation datasets (also available in PDF)
- Session 6: Uncertainty and human agreement
- Session 7: Catch up 1: Quick introduction to some other classifiers
Topic Two: Hidden Markov Models (4 sessions)
- Session 8: Training the HMM
- Session 9: Viterbi algorithm
- Session 10: HMMs in a biological application
- Session 11: Catch up 2: Ethical Issues in Machine Learning
Topic Three: Social Networks (3 sessions)
- Session 12: Properties of Networks
- Session 13: Betweenness
- Session 14: Clustering
Soft Ticking Deadlines
- Ticks 1, 2 and 3: Friday 03/02
- Ticks 4, 5 and 6: Monday 13/02
- Ticks 7, 8 and 9: Monday 27/02
- Ticks 10, 11 and 12: Monday 13/03 (last session)
Examinability of material for MLRD
All material which is examinable can be found in the slides, in the ticks or in the additional notes. The slides and additional notes are also available from this page. Materials will become available shortly before each session.
In some cases, students are asked to read additional material, such as some parts of the Easley and Kleinberg textbook. This is explicitly noted on the slides; such concepts are examinable.
Of course, it may also be necessary for students to read additional material in order to fully understand the material presented: you should ask your supervisor for help if you think this applies to you and are uncertain about what to read.
We provide starred ticks for further understanding. It is not necessary to do the starred ticks for the exams. However, looking at the material for the starred ticks may well help you understand the course material more thoroughly.
Material introduced in catch up session lectures is not examinable.
For fairness, the lecturers will not answer any individual questions about whether material is or is not examinable.
Supervision Questions
There are 4 question sets here.