Computer Laboratory

Course pages 2017–18

Databases

Principal lecturer: Dr Timothy Griffin
Taken by: Part IA CST 75%, Part IB CST 50%
Past exam questions

No. of lectures and practical classes: 8 + 4
Suggested hours of supervisions: 3
Prerequisite courses: None

Aims

This course introduces basic concepts for database systems as seen from the perspective of application designers. That is, the focus is on the abstractions supported by database management systems and not on how those abstractions are implemented.

The database world is currently undergoing swift and dramatic transformations largely driven by Internet-oriented applications and services. Today many more options are available to database application developers than in the past and so it is becoming increasingly difficult to sort fact from fiction. The course attempts to cut through the fog with a practical approach that emphasises engineering tradeoffs that underpin these recent developments and also guide our selection of “the right tool for the job.”

This course covers three approaches. First, the traditional mainstay of the database industry -- the relational approach -- is described with emphasis on eliminating logical redundancy in data. Then two representatives of recent trends are presented -- graph-oriented and document-oriented databases. The lectures are tightly integrated with the associated practical sessions where students gain hands-on experience with all three of these approaches.

Lectures

  • Introduction. What is a database system? What is a data model? A central tradeoff in the choice of data representation: optimise for ease of updating or for fast query response. On-Line Transaction Processing (OLTP) versus On-line Analytical Processing (OLAP). Application independent versus application specific data representations. [1 lecture]

  • Conceptual modeling The Entity-Relationship (ER) approach as an implementation-independent technique for modeling data. [1 lecture]

  • The relational model Implementing ER models with relational tables. Relational algebra and SQL. Update anomalies caused by logical redundancy. Minimise logical redundancy with normalised data representation. Functional dependencies (FDs) as a formal means of investigating redundancy. What is transitive closure? Why SQL struggles with transitive closure. [2 lectures]

  • The graph-oriented model The NoSQL movement. Implementing ER models in a graph-oriented database. Graph databases: optimised for computing transitive closure. Path-oriented queries. [2 lectures]

  • The document-oriented model Semi-structured data (XML, JSON). Document-oriented databases. Embracing data redundancy: representing data for fast, application-specific, access. The CAP principle for distributed database relating Consistency, Availability, and Partition Tolerance. Integration of relational and document-oriented approaches. [2 lectures]

Objectives

At the end of the course students should

  • be able to design entity-relationship diagrams to represent simple database application scenarios

  • know how to convert entity-relationship diagrams to relational- and graph-oriented implementations

  • understand the fundamental tradeoff between the ease of updating data and the response time of complex queries

  • understand that no single data architecture can be used to meet all data management requirements

  • be familiar with recent trends in the database area.

Recommended reading

Ullman, J. & Widom, J. (1997) A first course in database systems. Prentice Hall.