Databases
Change Log:
- 7 Oct: Page installed.
- 24 Oct: updated supervisions.
- 26 Oct: updated graph database.
- 28 Oct: added comments on exercise 1b and notes on transitive closure.
- 2 Nov: (None of the following is examinable.) Added notes_on_bacon_number.txt and links on various kinds of graph statistics: Centrality and Community Structure. These ideas have been the basis of several past Part-II projects! A fantastic source of data is the Stanford Large Network Dataset Collection.
Lecture notes
- databases_2021.pdf (one slide per page)
- Understanding the Wisconsin accent
- Notes on basic set theory
- notes on transitive closure
Recommended Text
- The recommended text is now online for Cambridge students.
Suggested supervisions
- Supervision 1
- Supervision 2
- Supervision 3: Past tripos questions! (Note: the first 1A version of this course was in 2017.)
Practical work
On 27 October and 3 November, I will select 12 students for online (Zoom) ticking. Check your email at 10am.
- Relational (due Oct 26, before 23:59)
- Relational Tutorial
- The data : movies-relational.zip
- Configuration file
- Tick 1
- comments_on_exercise1c.txt
- notes_on_bacon_number.txt
- Document-oriented (due by Nov 2, before 23:59)
- DOCtorWho Tutorial
- The data : data.zip
- Tick 2
- Graph-oriented (No tick this year!)
- See the Neo4j tutorial from last year: Graph Tutorial 2020.
- I was not able to get past the new authentication mechanisms of the latest version of Neo4j.
Nothing below is examinable!
The code for generating our three database instances can be found here https://github.com/Timothy-G-Griffin/build_databases.cst.cam.ac.uk.git. Comments appreciated.
A book on graph databases: Neo4j_Graph_Algorithms.pdf.
Some open source relational database systems
- HyperSQL : http://hsqldb.org.
- Postgres : http://www.postgresql.org/
- MySQL : http://www.mysql.com/
- SQLite : http://www.sqlite.org/
A few "NoSQL" pointers
- NoSQL Movement : http://en.wikipedia.org/wiki/NoSQL_(concept)
- A list of NoSQL database systems : http://nosql-database.org/
- Berkeley DB : http://en.wikipedia.org/wiki/Berkeley_DB
- Graph Databases : http://en.wikipedia.org/wiki/Graph_database
- Dremel: Interactive Analysis of Web-Scale Datasets
- F1: A Distributed SQL Database That Scales
Further reading
- Tarski's 1941 paper "On the Calculus of Relations".
- A short biography of Alfred Tarski http://en.wikipedia.org/wiki/Alfred_Tarski.
- Codd's original 1970 paper describing the relational model (reprinted here in 1983).
- A short biography of Edgar Codd http://en.wikipedia.org/wiki/Edgar_F._Codd.
- Chen's original 1976 paper on Entity-Relationship models.
- A short biography of Peter Chen http://en.wikipedia.org/wiki/Peter_Chen.
- Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals