Relational databases: Solution notes

This document provides some model answers to help supervisors. Every exercise is designed so that a correct answer produces a unique output. But of course there are many ways to write a correct query.

Exercise 1

Complete the following query template to produce a query that returns rows name, total where the name column is the name of a person, and the total column indicates the number of movies of the genre Drama for which that person is a writer.

   SELECT name, count(*) AS total
      YOUR-CODE-GOES-HERE
   GROUP BY name
   ORDER BY total desc, name
   LIMIT 10;

Solution notes:

SELECT name, count(*) AS total
FROM people
JOIN credits  ON credits.person_id = people.id AND type = 'writer'
JOIN genres ON genres.movie_id = credits.movie_id AND genre = 'Drama'
GROUP BY name
ORDER BY total desc, name
LIMIT 10;

On the small data set this query produces the output:

NAME                TOTAL
------------------  -----
Coen, Ethan             3
Coen, Joel              3
Hornby, Nick (I)        2
Linklater, Richard      2
Morgan, Peter (I)       2
Russell, David O.       2
Sorkin, Aaron           2
Affleck, Ben            1
Anderson, Wes (I)       1
Arata, David (I)        1

Exercise 2

Now modify your solution for Exercise 1 so that names and totals are associated with those writers that write ONLY for movies associated ONLY with the genre Drama.

Solution notes:

SELECT name, count(*) AS total
FROM people
JOIN credits  ON credits.person_id = people.id AND type = 'writer'
JOIN genres ON genres.movie_id = credits.movie_id AND genre = 'Drama'
WHERE name NOT IN (
   SELECT name
   FROM people
   JOIN credits ON credits.person_id = id and type = 'writer'
   JOIN genres ON genres.movie_id = credits.movie_id AND genre <> 'Drama'
)
GROUP BY name
ORDER BY total desc, name
LIMIT 10;

On the small data set this query produces the output:

NAME                    TOTAL
----------------------  -----
Barber, Lynn (II)           1
Cretton, Destin Daniel      1
Donoghue, Emma (I)          1
Koreeda, Hirokazu           1
Lee, Chang-dong             1
Mungiu, Cristian            1
Negin, Oleg                 1
Nichols, Jeff (VI)          1
Radulescu, Razvan           1
Zvyagintsev, Andrey         1

Exercise 3

The position column of the credits table indicates an actor's rank on the billing of a movie. So for example, position = 1 represents a billing of a top star. A movie may have several top stars sharing the top billing. Your task is to complete the following query template so that it returns rows of the form name1, name2, title where the names are of two (different) actors sharing the top billing in the movie having the associated title. The first name should be lexicographically less than the second.

   SELECT P1.name AS name1, P2.name AS name2, title
      YOUR-CODE-GOES-HERE
   ORDER BY name1, name2, title ;

Solution notes:

SELECT P1.name AS name1, P2.name AS name2, title
FROM movies
JOIN credits AS C1 ON C1.position = 1
JOIN credits AS C2 ON C1.movie_id = C2.movie_id
JOIN people  AS P1 ON P1.id = C1.person_id
JOIN people  AS P2 ON P2.id = C2.person_id AND C2.position = 1
WHERE P2.id <> P1.id
      AND movies.id = C1.movie_id
      AND P1.name < P2.name
ORDER BY name1, name2, title ;

On the small data set this query produces the output:

NAME1            NAME2                  TITLE
---------------  ---------------------  ----------------------
Owen, Clive (I)  Yacuzzi, Juan Gabriel  Children of Men (2006)

Common mistakes: Attempting this with only one instance of credits or people tables. Forgetting one of the clauses that constrain the result.

Once we cover graph databases, you might want to point out how easy it is to write this in Cypher:

match (p1:Person)-[r1:ACTS_IN]->(m:Movie)<-[r2:ACTS_IN]-(p2:Person)
where p1.person_id < p2.person_id
  and r1.position = 1
  and r2.position = 1
return p1.name as name1, p2.name as name2, m.title as title;