Computer Laboratory

Course material 2010–11


Databases

Lecturer: Dr T.G. Griffin

No. of lectures: 12

Aims

The overall aim of the course is to cover the fundamentals of database management systems (DBMSs), paying particular attention to relational database systems. The course covers modelling techniques, transferring designs to actual database implementations, SQL, models of query languages, transactions as well as more recent developments, including data warehouses and On-line Analytical Processing (OLAP), and use of XML as a data exchange language. The lectures will make use of the open source DBMS, MySQL.

Lectures

  • Introduction. What is a database system? Database systems are more than just a collection of data. Three level architecture. OnLine Transaction Processing (OLTP) versus OnLine Analytic Processing (OLAP).

  • Entity-Relationship (E/R) modelling. A bit of set theory. Entities have attributes. Relations have arity. Database design and data modelling.

  • The relational data model. Relations are sets of records. Representing entities and relationships as relations. Queries as derived relations. Relations are the basis of SQL.

  • Relational algebra. Relational algebra as an abstract query language. Core operations - selection, projection, product, renaming, and joins.

  • Relational calculus. Relational calculus as an abstract query language that uses notation from set theory. Equivalence with relational algebra.

  • SQL and integrity constraints. An overview of the core of SQL. SQL has constructs taken from both the relational algebra and the relational calculus. Integrity constraints as special queries.

  • Schema refinement I. The evils of redundancy. The benefits of redundancy. Functional dependencies as a formal means of investigating redundancy. Relational decomposition. Armstrong’s axioms.

  • Schema refinement II. Schema normalisation. Lossless-join decomposition. Dependency preservation. Boyce-Codd normal form. Third normal form.

  • Further relational algebra, SQL. SQL is really based on multi-sets (bags). Extending the relational algebra to bags. NULL values as an SQL design error?

  • Transaction management overview. ACID properties - Atomicity, Consistency, Isolation, and Durability. Serialisability in the database context.

  • On-line Analytical Processing (OLAP). When to forget about data normalisation. Beware of buzz-words and the Data Warehouse Death March. More on OLTP versus OLAP. What is a data cube? Data modelling for data warehouses: star schema.

  • XML as a data exchange format. What is XML? XML can be used to share data between proprietary relational databases. XML-based databases?

Objectives

At the end of the course students should

  • be able to design entity-relationship diagrams to represent simple database application scenarios;

  • know how to convert entity-relationship diagrams to relational database schemas in the standard Normal Forms;

  • be able to program simple database applications in SQL;

  • understand the basic theory of the relational model and both its strengths and weaknesses;

  • be familiar with various recent trends in the database area.

Recommended reading

* Date, C.J. (2004). An introduction to database systems. Addison-Wesley (8th ed.).
Elmasri, R. & Navathe, S.B. (2000). Fundamentals of database systems. Addison-Wesley (3rd ed.).
Silberschatz, A., Korth, H.F. & Sudarshan, S. (2002). Database system concepts. McGraw-Hill (4th ed.).
Ullman, J. & Widom, J. (1997). A first course in database systems. Prentice Hall.
Miszczyk, J. and others (1998). Mastering data warehousing functions. (IBM Redbook DB2/400) Chapters 1 & 2 only. http://www.redbooks.ibm.com/abstracts/sg245184.html
Garcia-Molina, H. Data warehousing and OLAP. Stanford University. http://www.cs.uh.edu/ ceick/6340/dw-olap.ppt
London Metropolitan University, Department of Computing. Data warehousing and OLAP technology for data mining. http://learning.unl.ac.uk/csp002n/CSP002N_wk2.ppt