Up: Easter Term 2002: Part
Previous: Computer Vision
  Contents
Lecturer: Dr G.M. Bierman
(gmb@cl.cam.ac.uk)
No. of lectures: 12
Aims
The aims of this course are to introduce the principles, models and
applications of database management. The course will introduce some
typical database applications to demonstrate enterprise evolution.
Three requirements arise from these examples: data independence; a
formally expressed data model (schema); support for transactions,
including durability. The value of database management systems that
satisfy these requirements is noted. The course is illustrated with
three important data models: the network (CODASYL) model, E.F. Codd's
relational model, and the ODMG standard for a language-independent
object data model. Practical work in SQL will be offered using
IBM's DB2 DBMS above Linux.
Lectures
This is the syllabus for 2000-2001. There may be some changes for
2001-2002.
- The nature of a database.
Essential features of databases, examples. Formal data models:
resilience to change, data independence. The role of metadata.
Mutation rate of data items. Benefits and cost of indexing.
- The architecture of a DBMS.
ANSI/SPARC architecture for DBMS. The need for a Data Definition
Language. Brief history of Data Models and DBMS. Corporate
evolution and the need for Standards. Schema evolution.
- Transaction processing, physical database design.
Recap of ACID properties of transactions. Rotating storage devices,
space allocation, the need for synchronised atomic change. Indexing
and inverted files (B-trees assumed). Extensible hashing schemes.
- Database design.
Entities and attributes. Mathematical foundations: sets and relations;
relation product and natural join; theta-joins. Functional dependency
between attribute sets. Single and multi-attribute keys. Relationships
between entities. Entity-Relationship (E/R) diagrams. Degree and
multiplicity of relationships. Entity types as classes. Object-based
design methodology. UML.
- Network model of data.
Historical review of DBMS. CODASYL/DBTG proposals as an example of
the ANSI/SPARC architecture. Bachman diagrams. Storage diagrams and
explicit traversal. Record identifiers (DB keys). Control of record
placement. Lack of data independence.
- Relational model: basic semantics.
Codd's 1970 paper. Data independence. Domains and data types.
n-tuples over domains. First Normal Form. Keys as user level tuple
identifiers. Data independence. Describing entities using tuples.
Describing inter-entity relationships by tuples. Research into
efficient implementation. System-R. Duplicate elimination.
Relations as sets, Tables as bags. Transaction support, Roll-Back.
Work on query optimisation. SQL.
- Relational algebra.
PRTV and ISBL. Boolean operations on relations. Key-based semantics.
Cartesian product. Select, Project and (equi-)Join. Semi-joins.
Examples. Techniques for implementing joins. Cardinality.
Views as expressions in relational algebra. INGRES, access control.
- Relational schema design.
Functional dependence between attributes. Data redundancy for access,
update and insertion anomalies. Second and Third Normal Form (3NF).
Boyce-Codd normal form. Examples. Loss of referential integrity.
Multi-valued dependencies. Fourth Normal Form (4NF).
- Relational calculus and SQL.
Relationally complete languages. SQL2 as a standard. Combining DDL
and DML functionality. Multi-set semantics. Null values and Outer
Joins. SQL2 Syntax and semantics. Aggregates and GROUP_BY.
Examples.
- Semantic considerations in the SQL2 data model.
Entity integrity. Modelling application datatypes, repeating groups.
Collection types. Subdomains and inheritance. Generalisation.
Referential integrity and Foreign keys. The need for transitive
closure. Fixed-point semantics. Deductive databases. DATALOG.
SQL3.
- Distributed applications.
Legacy databases. Data mining. Corporate restructuring. Federation.
Identifying entities, naming. Data collection and data warehouses.
On-line Analytical processing (OLAP). Multi-dimensional databases,
data cubes. Materialised views. Entity representations as objects.
(O)ODBMS. Object Data Management Group (ODMG). The Object Database
Standard ODMG 2.0. The ODMG object model.
- The Object Database Standard ODMG 2.0.
The ODMG Object Model. ODMG/ODL as a Data Definition Language.
Metadata and the ODL Schema Repository. Type system: atomic objects
(user-defined), collection objects, literal types, structures.
NULL-extended types. Tables. Attributes. Object identifiers as
values. Relationships and their inverses. Operations and exceptions.
Transactions. Lock types, 2PL as the default concurrency control
policy. Databases. OQL: an object query language extending the SQL
query sublanguage. Update by method invocation. Navigation via path
expressions. WHERE clauses. Definition as a view mechanism.
Objectives
At the end of the course students should
- be able to design entity-relationship diagrams to represent simple
database application scenarios
- know how to convert entity-relationship diagrams to relational
database schemas in the standard Normal Forms
- be able to program simple database applications in SQL
- understand both the strengths and the weaknesses of the relational
model of data
- be familiar with the fundamentals of object database management
Recommended books
Ullman, J.D. & Widom, J. (1997). A First Course in Database Systems.
Prentice-Hall.
Korth, H.F. & Silberschatz, A. (1991). Database System Concepts.
McGraw-Hill (2nd ed.).
Date, C.J. (1995). An Introduction to Database Systems, vol. 1.
Addison-Wesley (6th ed.).
Kent, W.T. (1978). Data and Reality. North-Holland.
Cattell, R.G.G. (ed.) (1997). The Object Database Standard ODMG 2.0.
Morgan Kaufmann.
Up: Easter Term 2002: Part
Previous: Computer Vision
  Contents
Christine Northeast
Tue Sep 4 09:34:31 BST 2001