https://www.cl.cam.ac.uk/teaching/2324/Databases/clarifications1.txt 0. There has already been an minor edit to the text of question three in Exercise two. Your sheet now reads "You should use the number of word occurrences for a film to compute ..." instead of "You should use the number (total count) of relevant films to compute...". The previous text makes little sense. --------------------------------------------- 1 The requirement that outputs are UTF-8 has now been relaxed, but please do not submit the entire database or files with 100,000 plus lines or archives suffixed zip.zip. --------------------------------------------- 2 Also, do not email links to files hosted on cloud services where the attempt to fetch the file gives a 'permission denied' response: instead the file should in the email as a normal (mime-encoded) attachment. --------------------------------------------- 3 If you have been asked to "Write a query that returns a distinct table of groups of three films directed by the same person, lexographically ordered by the number of votes a film has. Also include the director's name in a fourth column. As an extension, ensure that no two groups of three contain a pair of films in common." The question was overly vague and possibly very difficult in some interpretations. I am sorry about wasting your time. The extension did not make much sense. Full credit will be awarded for any viable interpretation: I've noticed people have been putting a few comments in their SQL about this. The answer should have film titles in it, not movie numbers. Make sure your ordering of film titles is on their votes and not on their movie keys: it's tempting to rely on less than applied to movie keys as part of the filtering operation, but this does not provide lexo sorting of titles! Q. What happens for directors with more than 3 films? Do we return all possible triples of these films (and ignore the extension)? Do we choose just 3 arbitrary films by this director (as to satisfy the extension)? Do we group the films into triples, and discard the remaining films (i.e. if a director has films A, B, C, D, E, F, G, we return the triples {A, B, C} and {D, E, F}, and ignore G)? A. Yes, all possible sorted triples is the baseline answer, whereas the further filtering will end up deleteing G, as per your excellent example. Q. Is the extension necessary? A. No, it is optional. And no longer recommended! Q. What does sorting “lexographically ordered by the number of votes a film has” mean? I’m assuming lexographically is supposed to be lexicographically. But does this means that the three films WITHIN the same row should be sorted lexicographically, or the rows themselves should be sorted lexicographically? If it is the latter, do we compare the first film in each row, and tiebreak by the second and third films? A. Yes, both. The three films within a row should be sorted according to the number of votes they have, highest first. That's not really lex(ic)ographically: that's mostly a simple sort, but tie-breaking on title makes it lexo. The order of rows before the LIMIT is applied should then be a lexi sort based on everything in the result table, as is common for all the exercises. Q. What does “distinct table” mean? Does it mean that for any two triplets of movies R1 = (a1,b1,c1) and R2 = (a2,b2,c2), we must ensure that R2 is not a reordering of R1? A. Yes, but the 'distinct' distinction is moot, I suppose, if the lexo sorting has required a unique ordering in the first place. Q. What does “lexicographically ordered by the number of votes” mean? Given this if i have a record A, B, C, do you want score(A) > score(B) > score(C) A. Yes, it's a vote order sort with tiebreak on film title ordering, but you could ask for full credit if you miss that bit out since it was not clear. Q. Am I free to choose the relative ordering of movies within each row? A. No, I don't think so? Each film is related to a specific number of votes and the lexo ordering of (vote, title) pairs is required within the row. Q. Does “no two groups of three contain a pair of films in common” mean that their intersection has at most 1 element? A. Yes, I think so. Please ignore the extension. It does not make much sense (see above). Sorry. Q. Also, am I required to do the extension “ensure no two triplets share a pair in common” in order to receive full credit for my tick? A. No, this should have been marked as an optional extension. Also it does not make much sense (see above). Sorry. END