Databases IA 2022/23: Supervision Sheet 2B Questions

These are suggested questions that supervisors might want to use in their supervisions. They are meant to indicate the type of questions that will be on the Tripos exam.

These questions (Q6 onward) relate to material from Lecture 5 onward. Owing to the spaced-out timetabling of the lectures this year, these questions have been split out into a separate sheet, 2B. However, if you follow the recommendation of three supervisions in total for this course, sheet 2B would likely be covered in the second supervision along with sheet 2A.

Q6. Define consistency and eventual consistency. Which might be easier to support in a distributed database compared with a monolithic one? Give two examples of where consistency might be violated, firstly in a monolithic database with concurrent operations and secondly in a monolithic database with a redundant schema.
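
For supervisors who want a concrete starting point for the first example in Q6, the following is a minimal sketch of a lost update, assuming a single in-memory table with no locking or transactions. The names accounts, interleaved_deposits and serial_deposits are hypothetical and chosen only for illustration; the interleaving is written out by hand rather than produced by real threads, so that the outcome is deterministic.

```python
# Hypothetical in-memory "table": account id -> balance.
accounts = {"alice": 100}


def read_balance(db, key):
    return db[key]


def write_balance(db, key, value):
    db[key] = value


def interleaved_deposits(db, key, amount_a, amount_b):
    """Two deposits whose read and write steps interleave with no locking."""
    seen_by_a = read_balance(db, key)              # transaction A reads
    seen_by_b = read_balance(db, key)              # transaction B reads the same old value
    write_balance(db, key, seen_by_a + amount_a)   # A writes its result
    write_balance(db, key, seen_by_b + amount_b)   # B overwrites A's update
    return db[key]


def serial_deposits(db, key, amount_a, amount_b):
    """The same two deposits run one after the other."""
    write_balance(db, key, read_balance(db, key) + amount_a)
    write_balance(db, key, read_balance(db, key) + amount_b)
    return db[key]


# Each run works on its own copy of the table.
lost = interleaved_deposits(dict(accounts), "alice", 10, 20)
serial = serial_deposits(dict(accounts), "alice", 10, 20)
print(lost, serial)  # 120 130: the interleaved run has lost the deposit of 10
```

A useful follow-up is to ask which isolation mechanism (for example, locking the row for the duration of the read-modify-write) would rule out the interleaved schedule.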

Q7. What does it mean for data to be functionally dependent on the key it is stored against? Give an example of data or a schema where this approach is violated. Is computing and storing an index a violation? What about computing and storing expensive metrics of the data (such as the results of NLP)? Is this a good idea? Which of these introduces redundancy, if any? Is this good or bad?
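
As an aid to discussing Q7, the sketch below checks whether a functional dependency holds in a small set of rows. It assumes rows are plain Python dictionaries; the function name violates_fd and the enrolment data are hypothetical. The example table stores a student's college against the key (student_id, course_id), although the college depends on student_id alone, so the redundant copies are free to disagree.

```python
def violates_fd(rows, determinant, dependent):
    """Return the conflicts that break the dependency determinant -> dependent.

    Each conflict is (determinant values, first values seen, conflicting values).
    """
    seen = {}
    violations = []
    for row in rows:
        lhs = tuple(row[a] for a in determinant)
        rhs = tuple(row[a] for a in dependent)
        if lhs in seen and seen[lhs] != rhs:
            violations.append((lhs, seen[lhs], rhs))
        seen.setdefault(lhs, rhs)
    return violations


# Hypothetical enrolment table keyed on (student_id, course_id).
rows = [
    {"student_id": 1, "course_id": "DB", "college": "Kings"},
    {"student_id": 1, "course_id": "ML", "college": "Queens"},  # disagrees with the row above
    {"student_id": 2, "course_id": "DB", "college": "Trinity"},
]

# The full key trivially determines college: no two rows share a key.
by_full_key = violates_fd(rows, ["student_id", "course_id"], ["college"])

# The intended dependency student_id -> college has been broken by the redundant copy.
by_student = violates_fd(rows, ["student_id"], ["college"])
print(by_full_key, by_student)
```

The check only reports that an instance of the data breaks the dependency; whether the dependency is meant to hold is a property of the schema design, not something the data alone can establish.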

You are recommended to read the N*SQL chapter of Lemahieu (Chapter 11).

Q8. What, if anything, are the differences between a key/value store, a document database and an aggregate database? What needs to be known (or can be checked) about the stored data by (a) the DBMS and (b) the end user? How can such systems handle malformed data in queries or updates?
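
One possible way to make the contrast in Q8 concrete is sketched below. It assumes a key/value store that treats values as opaque bytes and a document store that parses values as JSON objects; both classes are hypothetical toys, not the interface of any real product.

```python
import json


class KeyValueStore:
    """Treats every value as opaque bytes: nothing is known or checked."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


class DocumentStore:
    """Parses each value as a JSON object, so it can validate and query inside it."""

    def __init__(self):
        self._docs = {}

    def put(self, key, text):
        doc = json.loads(text)  # raises json.JSONDecodeError (a ValueError) if malformed
        if not isinstance(doc, dict):
            raise ValueError("a document must be a JSON object")
        self._docs[key] = doc

    def find(self, field, value):
        """Return the keys of all documents whose top-level field equals value."""
        return sorted(k for k, d in self._docs.items() if d.get(field) == value)


malformed = '{"name": "Ada", "college": '   # truncated JSON

kv = KeyValueStore()
kv.put("u1", malformed.encode("utf-8"))     # accepted: the store cannot tell it is broken

docs = DocumentStore()
docs.put("u1", '{"name": "Ada", "college": "Kings"}')
docs.put("u2", '{"name": "Bob", "college": "Kings"}')
try:
    docs.put("u3", malformed)
    rejected = False
except ValueError:
    rejected = True                         # the document store refuses the update

print(rejected, docs.find("college", "Kings"))
```

In the key/value case, the burden of knowing the format and detecting malformed data falls entirely on the end user's application code.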

Q9. Define sharding and shredding with respect to a document database. Why is each potentially good and bad?
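
The following sketch may help to anchor the two terms in Q9. It rests on two stated assumptions: that sharding means assigning whole documents to one of several nodes by hashing their key, and that shredding means breaking a nested document into flat (doc_id, path, value) rows. The lecture notes may define shredding differently, so treat this as one interpretation only. The function names shard_for and shred are hypothetical.

```python
import hashlib


def shard_for(key, n_shards):
    """Pick a shard from a stable hash of the key.

    hashlib is used because Python's built-in hash() of a string
    varies between processes by default.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


def shred(doc_id, doc, prefix=""):
    """Flatten a nested dict into (doc_id, path, value) rows.

    Non-dict values, including lists, are kept as a single value in this sketch.
    """
    rows = []
    for field, value in doc.items():
        path = f"{prefix}.{field}" if prefix else field
        if isinstance(value, dict):
            rows.extend(shred(doc_id, value, path))
        else:
            rows.append((doc_id, path, value))
    return rows


# Hypothetical document used only for illustration.
doc = {"title": "Databases", "author": {"name": "Ada", "year": 2023}}

shard = shard_for("d1", 4)
rows = shred("d1", doc)
print(shard, rows)
```

Note the trade-off visible even at this scale: a sharded document can be fetched whole from one node, whereas a shredded document must be reassembled from several rows before it can be returned.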