Databases Ia 2024/25: Supervision Sheet 2 Questions

These are suggested questions that supervisors might want to use in their supervisions. They are meant to indicate the type of questions that will be on the Tripos exam.

Q1. In mathematics, what is the difference between a set, a tuple and a list? How is each stored in a relational database? Are the differences fully respected?

Q2a. A relational database has two tables that both have about 100 records in them. Give many distinct reasons why it might be a very bad idea to replace these with a single table that has 100 records in where the records have the combined set of attributes.

Q2b. Data is often entered into a database after a human has filled out a form. Why are such forms used? In Computer Science, there is the concept of a normal form for a datastructure. What are the overlaps?

Q3. What is the difference between a candidate key, a super key and the primary key for a relational table? Does one of these keys have to be used for lookup? If not, why do they exist?

Q4a. Give an example of a transaction abort. Was this abort forced by the system for some reason or what is requested by the client? Do databases reads or writes always have to take place as part of a transaction?

Q4b. What are the differences and similarities between a database hosted on a powerful machine accessed over the network and a distributed database?

Q4c. Redundancy in a database system might exist in the main data model or just as part of the implementation. Discuss these relative forms. Include how they might be affected by the differences mentioned in Q4b. Include also performance consequences for a) the speed of update and b) speed of retrieval.

Q4d. List different reasons why copies of parts of data from a master database might be held elsewhere?

Lecture 6 Questions:

Q10. Define consistency and eventual consistency. Do these terms tend to differ in meaning between a distributed database compared with a monolithic one? Or do they tend to differ more in meaning in the context of an ACID vs BASE database. Give an example in each case of where consistency might be violated.

Q11a. What is meant by semi-structured data? What do data curators do?

Q11b. What, if anything, are the differences between the simple key/value store lectured, a document database and an aggregate database? What needs to be known (or can be checked) about the data stored by a) the DBMS and b) the end user? How can such systems handle malformed data in queries or updates?

Q12. Define sharding and shredding with respect to a document database (they are in the textbook). Why is each potentially good and bad?

Q13. [If time permits] MongoDB is widely used for storing JSON data. Using the terminology defined in this course, summarise its main attributes in 150 words or so.

END. (C) 2020-24 TGG, DJG, University of Cambridge, Computer Laboratory.