Course pages 2013–14
Databases
Never forget to ask What problem am I solving?
Lecture Slides:
- Lectures 1-9:
- One-slide per page: db_2014.pdf
- Two-slides per page: db_2014_2up.pdf (large file)
- Ken Moody's relational database early history KM-DB-history.pdf
- Lecture 10: Guest lecture. Grant Allen (Google) : BigData_NoSQL_Grant_Allen_Cambridge_2014.pdf
- Lecture 11:
- One-slide per page: db_2014_L11.pdf
- Two-slides per page: db_2014_L11_2up.pdf
- Lecture 12: Review.
A few SQL examples that may clarify the relational implementation of many-to-many and one-to-many relationships (from an ER model).
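As a minimal sketch of the idea (not the course's own example file), the snippet below uses Python's built-in sqlite3 module to show one common relational encoding: a one-to-many relationship becomes a foreign key on the "many" side, and a many-to-many relationship becomes a separate table keyed on the pair of foreign keys. The studio, movie, person and acts_in tables and all of their rows are hypothetical, invented purely for illustration.

```python
import sqlite3

# In-memory database; all table names and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE studio (
    studio_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- One-to-many: each movie belongs to exactly one studio, so the
-- foreign key lives on the "many" side (movie).
CREATE TABLE movie (
    movie_id  INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    studio_id INTEGER NOT NULL REFERENCES studio(studio_id)
);
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- Many-to-many: a separate table whose key is the pair of foreign keys.
CREATE TABLE acts_in (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    movie_id  INTEGER NOT NULL REFERENCES movie(movie_id),
    PRIMARY KEY (person_id, movie_id)
);
INSERT INTO studio VALUES (1, 'Studio A'), (2, 'Studio B');
INSERT INTO movie VALUES (10, 'Movie X', 1), (11, 'Movie Y', 1), (12, 'Movie Z', 2);
INSERT INTO person VALUES (100, 'Alice'), (101, 'Bob');
INSERT INTO acts_in VALUES (100, 10), (100, 12), (101, 10);
""")

# One-to-many: count the movies owned by each studio.
movies_per_studio = conn.execute("""
    SELECT s.name, COUNT(*)
    FROM studio s JOIN movie m ON m.studio_id = s.studio_id
    GROUP BY s.name
    ORDER BY s.name
""").fetchall()

# Many-to-many: list the cast of one movie by joining through acts_in.
cast_of_x = conn.execute("""
    SELECT p.name
    FROM person p
    JOIN acts_in a ON a.person_id = p.person_id
    JOIN movie m ON m.movie_id = a.movie_id
    WHERE m.title = 'Movie X'
    ORDER BY p.name
""").fetchall()

print(movies_per_studio)
print(cast_of_x)
```

Putting studio_id directly on movie is only appropriate because each movie is assumed to have exactly one studio; if a movie could have several, that relationship would need its own table, just like acts_in.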
Relational division example : relational_division.sql.
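For readers without the file to hand, here is a small sketch of relational division, again using Python's sqlite3 module. It is not the content of relational_division.sql; the takes and required tables and their rows are hypothetical. The query asks for every student who takes all of the required courses, phrased as "there is no required course that the student does not take", with a counting formulation as a cross-check.

```python
import sqlite3

# In-memory database; table names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE takes (student TEXT, course TEXT, PRIMARY KEY (student, course));
CREATE TABLE required (course TEXT PRIMARY KEY);
INSERT INTO takes VALUES
    ('ann', 'db'), ('ann', 'logic'), ('ann', 'graphics'),
    ('bob', 'db'),
    ('cat', 'db'), ('cat', 'logic');
INSERT INTO required VALUES ('db'), ('logic');
""")

# Division via double negation: keep a student if no required course
# exists that the student fails to take.
division = conn.execute("""
    SELECT DISTINCT t.student
    FROM takes t
    WHERE NOT EXISTS (
        SELECT 1 FROM required r
        WHERE NOT EXISTS (
            SELECT 1 FROM takes t2
            WHERE t2.student = t.student AND t2.course = r.course
        )
    )
    ORDER BY t.student
""").fetchall()

# Cross-check by counting: a student qualifies if the number of required
# courses they take equals the total number of required courses.
by_count = conn.execute("""
    SELECT t.student
    FROM takes t JOIN required r ON r.course = t.course
    GROUP BY t.student
    HAVING COUNT(*) = (SELECT COUNT(*) FROM required)
    ORDER BY t.student
""").fetchall()

print(division)
print(by_count)
```

The two formulations agree on this data but differ when required is empty: the double-negation query returns every student, while the counting query returns none.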
Some sample supervision question sets are available for you to use, edit and extend on the Online Teaching Site (courtesy of Andy Rice).
- Problem set 1 : http://ott.cl.cam.ac.uk/questions/sets/310.
- Problem set 2 : Prove claims 1 and 2 from slide 103. Prove soundness from slide 105. Prove pseudo-transitivity and decomposition (slide 113) using only Armstrong's axioms.
- Problem set 3 : Extend the "movie release" ER model from slide 146 in order to model multi-region ratings as hinted at in slide 151. Explore the choice between ternary relationships and multiple binary relationships as presented in slides 152 and 153.
Some open source database systems :
- HyperSQL : http://hsqldb.org.
- To use from Java code (as used in Further Java) : Unzip the archive file, change to the extracted directory, then launch a database GUI with java -cp {YOURPATH}hsqldb.jar org.hsqldb.util.DatabaseManagerSwing.
- To use the read-execute-print loop, run sqltool.jar with java -jar {YOURPATH}sqltool.jar. You may have to set up a sqltool.rc file in your home directory as documented at http://hsqldb.org.
- Postgres : http://www.postgresql.org/
- MySQL : http://www.mysql.com/
- SQLite : http://www.sqlite.org/
Primary sources:
- Tarski's 1941 paper "On the Calculus of Relations".
- A short biography of Alfred Tarski http://en.wikipedia.org/wiki/Alfred_Tarski.
- Codd's original 1970 paper describing the relational model (reprinted here in 1983).
- A short biography of Edgar Codd http://en.wikipedia.org/wiki/Edgar_F._Codd.
- Chen's original 1976 paper on Entity-Relationship models.
- A short biography of Peter Chen http://en.wikipedia.org/wiki/Peter_Chen.
- Fagin's definition of multivalued dependencies.
- Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals
IMDb data:
- Raw IMDb data files
- Handy Python package for manipulating raw IMDb data http://imdbpy.sourceforge.net/
Of interest (reading for the fun of it):
- Guest lecture from Lent 2012 on schema migration in the NHS cancer database.
- Dr. Jean Bacon asks What is a key?
- NoSQL Movement : http://en.wikipedia.org/wiki/NoSQL_(concept)
- Berkeley DB : http://en.wikipedia.org/wiki/Berkeley_DB
- Graph Databases : http://en.wikipedia.org/wiki/Graph_database
- Dremel: Interactive Analysis of Web-Scale Datasets
- F1: A Distributed SQL Database That Scales