Databases
Principal lecturer: Dr David Greaves
Taken by: Part IA CST
Term: Michaelmas
Hours: 12 (8 lectures + 2 practical classes)
Format: In-person lectures
Suggested hours of supervisions: 3
Past exam questions, Moodle, timetable
Aims
This course introduces basic concepts for database systems as seen from the perspective of application designers. That is, the focus is on the abstractions supported by database management systems and not on how those abstractions are implemented.
The database world is currently undergoing swift and dramatic transformations largely driven by Internet-oriented applications and services. Today many more options are available to database application developers than in the past and so it is becoming increasingly difficult to sort fact from fiction. The course attempts to cut through the fog with a practical approach that emphasises engineering tradeoffs that underpin these recent developments and also guide our selection of “the right tool for the job.”
This course covers three approaches. First, the traditional mainstay of the database industry -- the relational approach -- is described with emphasis on eliminating logical redundancy in data. Then two representatives of recent trends are presented -- graph-oriented and document-oriented databases. The lectures are supported with two Help and Tick sessions, where students gain hands-on experience and guidance with the Assessed Exercises (ticks).
Lectures
- L1 Introduction. What is a database system? What is a data model? Typical DBMS configurations and variations (fast queries or reliable update). In-core, in secondary store or distributed.
- L2 Conceptual modelling. Entities, relations, E/R diagrams and implementation-independent modelling, weak entities, cardinality.
- L3+4 The relational database model. Implementing E/R models with relational tables. Relational algebra and SQL. Basic query primitives. Update anomalies caused by redundancy. Minimising redundancy with normalised schemas.
- L5 Transactions. On-Line Transaction Processing. On-line Analytical Processing. Reliability, throughput, normal forms, ACID, BASE, eventual consistency.
- L6 Documents and semi-structured data. The NoSQL, schema-free movement. XML/JSON. Key/value stores. Embracing data redundancy: representing data for fast, application-specific access. Path queries (if time permits).
- L7 Further SQL. Multisets, NULL values, aggregates, transitive closure, expressibility (nested queries and recursive SQL).
- L8 Graph databases. Optimised for processing enormous numbers of nodes and edges. Implementing E/R models in a graph-oriented database. Comparison of the presented models.
Objectives
At the end of the course students should
- be able to design entity-relationship diagrams to represent simple database application scenarios
- know how to convert entity-relationship diagrams to relational- and graph-oriented implementations
- understand the fundamental tradeoff between the ease of updating data and the response time of complex queries
- understand that no single data architecture can be used to meet all data management requirements
- be familiar with recent trends in the database area.
Recommended reading
Lemahieu, W., Broucke, S. van den and Baesens, B. (2018) Principles of database management. Cambridge University Press.