Supervision 3 Questions

These are suggested questions that supervisors might want to use during supervisions.

Lecture 6 - Further SQL Et Cetera

  1. Explain how SQL relates to multisets and when to use the DISTINCT keyword.

  2. On our movie database, will the SQL query

       select person_id, movie_id, position
       from credits
       where not((not (position is null)) or position = 17); 
    

    return the same results as the query

       select person_id, movie_id, position from credits
       where position <> 17; 
    

    Explain your answer.

  3. What is the difference between an inner and an outer join? Note there are left and right variants.

  4. Suppose that a and b are 3-valued expressions in SQL (true, false, or null). Can the expression (a and b) is null be rewritten as an equivalent boolean combination of a, b, a is null, and b is null?

  5. In boolean logic, the expression not(a and b) always has the same value as (not a) or (not b). Is that true in 3-valued logic?

    What about the expression not(a or b) and the expression (not a) and (not b)?

    Is the expression a or (not a) always true in 3-valued logic? If not, can you extend this axiom to make it always true?

6. The GROUP BY construct requires two functions: a first to partition the data into groups and a second to reduce the various values in a group into a single value (known as reduction of a vector to a scalar). Which of the following operations are generally suitable for use in each context: addition, subtraction, roughly-equal, cosine, maximum, count, equals-to, greater-than, not-equal-to, average, identity? For each function, mention its abstract properties (eg. commutative, has an inverse). (Last years' Tripos).

  1. What is the longest path that has no loops which is possible in a graph with v nodes and e directed arcs?

End of L7 Material and L8 Graph Databases

The graph database tutorial is available here https://www.cl.cam.ac.uk/teaching/2324/Databases/djg-materials/graph-highlighted.html

Some versions of neo4j come with an example movie database, rather like the snapshot we've been using. We suggest you load our snapshot instead of using the default one. Have a play with it. Note that neither details of Neo4j or Cypher syntax will be asked about in Tripos questions.

To clarify further: the content of the relational tutorial is examinable, together with everything lectured in the 8 lectures. The content of the further two tutorials (document and graph), where it extends beyond what was lectured, is not examinable. The formal syllabus is on the course web page https://www.cl.cam.ac.uk/teaching/current/Databases/

  1. Summarise the notion of 'closure' in terms of repeated function composition and fixed points. Give a real-world example of transitive closure.

  2. Optional question: Explain why concepts of fixed points and closure are relevant to finding the results of recursive SQL queries. [Recursive SQL is not examinable, so perhaps just discuss this with your supervisor.]

  3. The following function is a least-fixed point finder coded in ML. Give a non-recursive ML function that that uses it to find the lowest natural number equal to or above the square root of argument S:

    let lfp f =
      let rec search n = if f(n)=n then n else search (n+1)
      in search 1 end
    let int_sqroot S =  ... your code that uses lfp ...
    

11. Neo4j edges have a unique identifier, a type and attributes. Write 3 example Neo4j queries that each illustrate looking up on each of these.

[Repeat: Neither details of Neo4J or Cypher syntax will be asked about in Tripos questions.] [NB. Experienced computerists will choke on the word 'type' above: they would be happier to say all edges have the same type and call this field a 'kind'.]

  1. What is a ternary relationship and can it be stored (easily) in a graph database? Distinguish between relationships between three different types of entity and 3-way relationships that relate fewer than three domains.
  2. Can a many-to-many relation easily be stored in a relational database? Does a graph database make it easier? Or is this just silly folklore?
  3. There is no 14 in this edition.

Further Questions to discuss with your supervisor

15. Suppose that we are presented with only the document database. It contains a tables of movies and a table of people in 23/24. Could we use this database to help us reconstruct an Entity-Relationship model of the data? Could you even consider automating this process?

  1. Why does a graph database typically store just one graph? (Last years' Tripos.)
  2. You may have answered DOC.3 for your second Assessed Exercise. Perhaps you used a keyword lookup approach to answering it. Would this be possible to phrase in the graph database of movies?
  3. Why is it generally helpful if database updates are idempotent? Give an example schema before and after being modified so that more of its updates can be phrased idempotently. [NB. There's not much in this course about updates outside of the transaction topic, so don't worry if you cannot think of an answer: again, this is more of a topic for part Ib Concurrency.]

END. (C) TGG 2021 (C) 2022/23 DJG, University of Cambridge, Computer Laboratory.