Graph databases: Practical Exercises

In Moodle, please submit three files: exercise2a.cypher, exercise2b.cypher, and exercise2c.cypher. If you haven't completed all parts, you can submit blank files. However, we recommend developing and debugging on your own machine before going to Moodle.

Graph databases: Solution notes

This document provides some model answers to help supervisors. Every exercise is designed so that a correct answer produces a unique output. But of course there are many ways to write a correct query.

Exercise 2.a

Write a cypher query to return name, total where name is the name of an actor and total is the number of movies they act in.

   YOUR-CODE-GOES-HERE
   order by total desc, name
   limit 10;

Solution notes:

One possible solution is

match (p:Person) -[:ACTED_IN]-> (m:Movie)
return p.name as name, count(*) as total
order by total desc, name
limit 10;

This query produces the output:

+-----------------------------+
| name                | total |
+-----------------------------+
| "Robert Downey Jr." | 19    |
| "Brad Pitt"         | 17    |
| "Amy Adams"         | 14    |
| "Christian Bale"    | 14    |
| "Jake Gyllenhaal"   | 14    |
| "Leonardo DiCaprio" | 14    |
| "Mark Ruffalo"      | 14    |
| "Ryan Gosling"      | 14    |
| "Tom Hanks"         | 14    |
| "Matt Damon"        | 13    |
+-----------------------------+

10 rows available after 18 ms, consumed after another 0 ms

Exercise 2.b

Write a cypher query that returns name, roles1, roles2 where name is the name of an actor, roles1 are the roles played in the movie The Matrix Reloaded, and roles2 are the roles played in the movie John Wick.

   YOUR-CODE-GOES-HERE
   order by name, roles1, roles2;

Solution notes:

One possible solution is

match (m:Movie{title : 'The Matrix Reloaded'})
       <-[r1:ACTED_IN]- (p) -[r2:ACTED_IN]->
      (n:Movie {title : 'John Wick'})
return p.name as name, r1.roles as roles1, r2.roles as roles2
order by name, roles1, roles2;

This query produces the output:

+------------------------------------------+
| name           | roles1  | roles2        |
+------------------------------------------+
| "Keanu Reeves" | ["Neo"] | ["John Wick"] |
+------------------------------------------+

1 row available after 6 ms, consumed after another 0 ms

Exercise 2.c

In the graph tutorial we saw the cypher query for computing each bacon number associated with people in our database and the total with that bacon number:

match path=allshortestpaths(
          (m:Person {name : "Kevin Bacon"} ) -[:ACTED_IN*]- (n:Person))
     where n.person_id <> m.person_id
     return length(path)/2 as bacon_number,
            count(distinct n.person_id) as total
order by bacon_number;

Just as a given movie can be associated with many actors, a given movie can be associated with many producers. Modify the above query to compute the spielberg_number that starts with "Steven Spielberg" and counts the people using the coproducer relationship rather than a coactor relationship. Don't forget to rename rename bacon_number to spielberg_number!

Solution notes:

One possible solution is

match path=allshortestpaths(
          (m:Person {name : "Steven Spielberg"} ) -[:PRODUCED*]- (n:Person))
     where n.person_id <> m.person_id
     return length(path)/2 as spielberg_number,
            count(distinct n.person_id) as total
order by spielberg_number;

This query produces the output:

+--------------------------+
| spielberg_number | total |
+--------------------------+
| 1                | 5     |
| 2                | 12    |
| 3                | 43    |
| 4                | 57    |
| 5                | 72    |
| 6                | 79    |
| 7                | 41    |
| 8                | 50    |
| 9                | 42    |
| 10               | 28    |
| 11               | 8     |
| 12               | 1     |
+--------------------------+

12 rows available after 134 ms, consumed after another 0 ms