Course pages 2018–19
Data Science: principles and practice
Principal lecturers: Dr Marek Rei, Dr Ekaterina Kochmar, Dr Damon Wischik, Prof Ted Briscoe
Taken by: Part II CST 75%
No. of lectures and practical classes: 12
Prerequisite courses: NST Mathematics; Machine Learning and Real-World Data; Foundations of Data Science.
Capacity: 40-50
Aims
The course will develop core areas of Data Science (e.g. models for regression and classification) from several perspectives: conceptual formulation and properties, solution algorithms and their implementation, and data visualization, both for exploratory data analysis and for the effective presentation of modelling outputs. The lectures will be complemented by practical classes using Python, scikit-learn and TensorFlow.
Lectures
- Introduction. Motivation, applications, examples, loading common
data formats, calculating statistics over a dataset, logistics and overview
of the course.
- Linear Regression. Defining a model, fitting a model, least squares
regression, linear regression, gradient descent, scikit-learn.
- Practical: Linear Regression
- Classification. Classification, perceptron, logistic regression,
multi-class classification, regularisation, kernels, exploratory data
visualisation.
- Practical: Classification
- Deep Learning, part I. Training neural networks, applications,
multilayer perceptrons, stochastic gradient descent, backpropagation.
- Deep Learning, part II. Advanced architectures, convnets, RNNs,
introduction to TensorFlow.
- Practical: Deep Learning
- Visualization, part I. Scales and coordinates, depicting
comparisons.
- Visualization, part II. Common plotting patterns, including
dimension reduction.
- Practical: Visualization
- Challenges in Data Science. Summary of the course, overview of other relevant techniques, ethics and privacy issues, bias in the training data, information about the hand-out test.
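As a rough illustration of two topics named in the Linear Regression lecture (least squares regression and gradient descent), the following minimal sketch fits a one-feature linear model in two ways and compares the results. It is not taken from the course materials: the toy data, the function names `closed_form_fit` and `gradient_descent_fit`, and the learning rate and step count are all assumptions chosen for illustration, and it uses only the Python standard library rather than scikit-learn.

```python
# Illustrative sketch only: not taken from the course materials.
# Fits y = w*x + b by least squares in two ways, using only the standard library.


def closed_form_fit(xs, ys):
    """Ordinary least squares for a single feature, solved analytically."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def gradient_descent_fit(xs, ys, learning_rate=0.05, steps=5000):
    """Minimise the mean squared error with batch gradient descent."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Residuals of the current model on every training point.
        residuals = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = 2.0 / n * sum(residuals)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data, made up for illustration; roughly y = 2x + 1 with small noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

w_exact, b_exact = closed_form_fit(xs, ys)
w_gd, b_gd = gradient_descent_fit(xs, ys)
print(f"closed form:      w={w_exact:.3f}, b={b_exact:.3f}")
print(f"gradient descent: w={w_gd:.3f}, b={b_gd:.3f}")
```

On data this small the two approaches should agree closely. Gradient descent becomes the more practical choice when the model has no closed-form solution, as with the neural networks covered in the Deep Learning lectures.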
Objectives
By the end of the course students should be able to:
- demonstrate understanding and practical skills in Data Science;
- specify and work with an analytical model;
- implement Data Science algorithms effectively;
- understand how data visualization underpins both the exploration of datasets and the communication of findings from data science models.
Recommended reading
Bishop, C.M. (2006). Pattern Recognition and Machine Learning. Springer.
MacKay, D.J. (2003). Information Theory, Inference and Learning
Algorithms. Cambridge University Press.
Python Basic Tutorial. Available online:
https://www.tutorialspoint.com/python/index.htm
Numpy: Quickstart Tutorial. Available online:
https://docs.scipy.org/doc/numpy/user/quickstart.html
Get Started with TensorFlow. Available online:
https://www.tensorflow.org/tutorials/