
Department of Computer Science and Technology

Part II CST

 

Course pages 2022–23

Advanced Data Science

Principal lecturers: Prof Neil Lawrence, Dr Carl Henrik Ek
Taken by: Part II CST
Code: ADS
Term: Michaelmas
Hours: 16
Format: In-person lectures
Prerequisites: Data Science, Machine Learning and Real-world Data. NST Mathematics

Practicals

The unit will have a practical session for each of the three stages of the data analysis pipeline. Each practical will be tied into an aspect of the final project.

Objectives

At the end of the course students will be familiar with the purpose of data science, how it differs from the closely related fields of machine learning, statistics and artificial intelligence, and what a typical data analysis pipeline looks like in practice. As well as emphasising the importance of analysis methods, we will introduce a formalism for organising how data science is done in practice and examine the different aspects the data scientist faces when giving data-driven answers to questions of interest.

Recommended reading

  • Simon Rogers et al. (2016). A First Course in Machine Learning, Second Edition. Chapman and Hall/CRC.
  • Shai Shalev-Shwartz et al. (2014). Understanding Machine Learning: From Theory to Algorithms. New York, NY, USA: Cambridge University Press.
  • Christopher M. Bishop (2006). Pattern Recognition and Machine Learning (Information Science and Statistics). Secaucus, NJ, USA: Springer-Verlag New York, Inc.

Assessment

Practical: There will be four practicals contributing 20% to the final module mark. These will be ‘ticked’ rather than graded: i.e., for each assignment, 100% of the mark is awarded for satisfactory completion and 0% for inadequate work or failure to submit. Data science is a topic where there is rarely a single correct answer; the important thing is therefore that an informed attempt, one that can be supported and motivated, has been made to address the questions.

Report: The course will be concluded by an individual report covering the material in the course through a practical exercise working with real data. This report will make up 80% of the mark for the course.