Course pages 2012–13
Information Retrieval
Lecturer: Dr S.H. Teufel
No. of lectures: 8
Suggested hours of supervisions: 2
Prerequisite courses: basic familiarity with probability is assumed
Aims
The course aims to characterise information retrieval in terms of the data, problems and concepts involved. It describes the main formal retrieval models and evaluation methods, and also covers web search. It then turns to problems and standard solutions in two related areas, clustering and text classification.
Lectures
- Introduction. Key problems and concepts. Information need. Indexing model. Examples.
- Retrieval models I. Boolean model. Stemming and other Term Manipulations.
- Retrieval models II. Vector Space Model and Term Weighting.
- Clustering. Proximity metrics, hierarchical vs. partitional clustering. Clustering algorithms. Evaluation metrics.
- Retrieval models III. Advanced Models: Dimensionality Reduction. Language Models. Relevance Feedback. Query Expansion.
- Search engines and linkage algorithms. PageRank; Kleinberg’s Hubs and Authorities.
- Evaluation Strategies. Test Collections. Precision, Recall, and more complex evaluation metrics.
- Question Answering. Task Definition and Evaluation. Three Algorithms for Question Answering.
Objectives
At the end of this course, students should be able to:
- define the tasks of information retrieval, web search, clustering and text classification, and explain the differences between them;
- understand the main concepts, challenges and strategies used in IR, in particular the retrieval models currently used;
- develop strategies suited to specific retrieval, clustering and classification situations, and recognise the limits of these strategies;
- understand the evaluation strategies developed for these three areas, and the reasons for them.
Recommended reading
* Manning, C.D., Raghavan, P. & Schütze, H. (2008). Introduction to information retrieval. Cambridge University Press. Available at http://www-csli.stanford.edu/~hinrich/information-retrieval-book.html.