Databases Ia 2023/24: Supervision Sheet 2 Questions

These are suggested questions that supervisors might want to use in their supervisions. They are meant to indicate the type of questions that will be on the Tripos exam.

Q1. In mathematics, what is the difference between a set, a tuple and a list? How is each stored in a relational database? Are the differences fully respected?

Q2. A relational database has two tables that both have about 100 records in them. Give many distinct reasons why it might be a very bad idea to replace these with a single table that has 100 records in where the records have the combined set of attributes.

Q3. What is the difference between a candidate key, a super key and the primary key for a relational table? Does one of these keys have to be used for lookup? If not, why do they exist?

Q4a. Do databases reads or writes always have to take place as part of a transaction?

Q4b. What are the differences and similarities between a database hosted on a powerful machine accessed over the network and a distributed database?

Q4c. Redundancy in a database system might exist in the main data model or just as part of the implementation. Discuss these relative forms. Include how they might be affected by the differences mentioned in Q4b. Include also performance consequences for a) the speed of update and b) speed of retrieval.

Q4d. List different reasons why copies of parts of data from a master database might be held elsewhere?

Lecture 6 Questions:

Q10. Define consistency and eventual consistency. Which might be easier to support in a distributed database compared with a monolithic one? Give two examples of where consistency might be violated, firstly in a monolithic database with concurrent operations and secondly in a monolithic database with a redundant schema.

Q11a. What is meant by semi-stuctured data? What do data curators do?

Q11b. What, if anything, are the differences between the simple key/value store lectured, a document database and an aggregate database? What needs to be known (or can be checked) about the data stored by a) the DBMS and b) the end user? How can such systems handle malformed data in queries or updates?

Q12. Define sharding and shredding with respect to a document database (they are in the textbook). Why is each potentially good and bad?

Q13. [If time permits] MongoDB is widely used for storing JSON data. Using the terminology defined in this course, summarise its main attributes in 150 words or so.

END. (C) 2020-23 TGG, DJG, University of Cambridge, Computer Laboratory.