Computer Laboratory

Course pages 2011–12

Information Retrieval

Material for Post-Lecture Exercises:

For Lecture 2: Instructions for building an index using Unix tools: IR practical.

Part I. Explanation: this was used as an exercise for MPhil students with little programming experience. You will also need some data (provided as a gzipped tar archive). After unpacking it, you should find 616 files. This is of course an unrealistically small "document collection", but it's fine for studying the basics.
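The practical builds its index with Unix tools. As a rough cross-check, the same idea (mapping each term to the files that contain it) can be sketched in Python. This is only an illustration under assumptions, not the practical's method: the function names `tokenize` and `build_index`, the tokenisation rule (lowercased alphanumeric runs), and the two stand-in files are all hypothetical, and the real collection's file format may call for different preprocessing.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(doc_dir):
    """Map each term to the sorted list of file names that contain it."""
    postings = defaultdict(set)
    for path in sorted(Path(doc_dir).iterdir()):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            for term in tokenize(text):
                postings[term].add(path.name)
    return {term: sorted(names) for term, names in postings.items()}


if __name__ == "__main__":
    # Two tiny stand-in documents; the real collection has 616 files.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "doc1.txt").write_text("The cat sat on the mat.", encoding="utf-8")
        Path(tmp, "doc2.txt").write_text("The dog chased the cat.", encoding="utf-8")
        index = build_index(tmp)
        print(index["cat"])  # files containing "cat"
        print(index["dog"])  # files containing "dog"
```

To try it on the course data, point `build_index` at the directory where the archive was unpacked.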

For Lecture 3: Follow Parts II and III of the instructions above, in order to build two simple retrieval models.
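This page does not say which two models Parts II and III build. Purely as an assumption, the sketch below illustrates two common simple models: Boolean AND retrieval and a basic tf-idf ranking. The function names `boolean_and` and `tfidf_rank`, the weighting formula (raw term frequency times log(N/df)), and the three toy documents are hypothetical and may differ from what the practical asks for.

```python
import math
from collections import Counter


def boolean_and(query_terms, docs):
    """Return the ids of documents that contain every query term."""
    return sorted(
        doc_id for doc_id, tokens in docs.items()
        if all(term in tokens for term in query_terms)
    )


def tfidf_rank(query_terms, docs):
    """Rank documents by the sum of tf * log(N / df) over the query terms."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        score = sum(
            tf[t] * math.log(n_docs / df[t])
            for t in query_terms if df[t] > 0
        )
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy documents, already tokenised (hypothetical data).
docs = {
    "d1": ["rat", "gene", "expression"],
    "d2": ["rat", "rat", "cluster"],
    "d3": ["gene", "cluster", "index"],
}
print(boolean_and(["rat", "gene"], docs))
print(tfidf_rank(["rat"], docs))
```

The Boolean model only says whether a document matches, while the tf-idf variant orders the matches, which is the main contrast worth observing on a small collection.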

For Lecture 4: Here are the rat genes tables (the second column is irrelevant and can be ignored; you can consider it as part of the name of the gene). Try a clustering toolkit such as cludo on them.