Course pages 2014–15
Information Retrieval
Lecturer: Dr S.H. Teufel
No. of lectures: 8
Suggested hours of supervisions: 2
Prerequisite courses: Mathematical Methods for CS (Part IB)
Aims
The course aims to characterise information retrieval in terms of the data, problems and concepts involved. It follows the textbook “Introduction to Information Retrieval” (see Recommended reading below). The main formal retrieval models and evaluation methods are described, with an emphasis on indexing. Web search is also covered. We also consider clustering as an application case of IR.
Lectures
- Introduction. (Chapters 1; 2.3)
Key problems and concepts. Information need. Boolean Operators.
- Boolean Retrieval and Indexing. (Chapters 2.2; 2.4)
Implementation of Boolean Operators. Term manipulations; equivalence
classes, stemming.
- Spelling Correction and Tolerant Retrieval. (Chapter
3). Wildcards. Spelling Correction.
- Index Construction and Compression. (Chapters 4.2-4.4,
5). BSBI, SPIMI, Distributed indexing. Dictionary
compression. Byte- and bit-level codes.
- The Vector Space Model. (Chapter 6). VSM and Term weighting.
- Evaluation. (Chapter 8, pp. 139-148). Test
Collections. Relevance. Precision, Recall, MAP, 11pt interpolated
average precision.
- Clustering. (Chapters 16.1-16.4; 17.1-17.2).
Proximity metrics, hierarchical vs. partitional
clustering. Clustering algorithms. Evaluation metrics.
- Link Analysis. (Chapter 21, excluding 21.2.3).
PageRank; Hubs and Authorities.
Objectives
At the end of this course, students should be able to
- define the tasks of information retrieval, web search and
clustering, and the differences between them;
- understand the main concepts, challenges and strategies used in
IR, in particular the retrieval models currently used;
- develop strategies suited for specific retrieval, clustering and
classification situations, and recognise the limits of these strategies;
- understand (the reasons for) the evaluation strategies developed
for the tasks covered.
Recommended reading
* Manning, C.D., Raghavan, P. & Schütze, H. (2008). Introduction to information retrieval. Cambridge University Press. Available at http://nlp.stanford.edu/IR-book/.