Prerequisite courses: None, but basic familiarity with probability is assumed
The course aims to characterise information retrieval in terms of
the data, problems and concepts involved. The two main formal
retrieval models (Boolean and vector space) and evaluation methods are described. The course then
covers problems and standard solutions in information extraction, and
in question answering.
Information retrieval introduction.
Key problems and concepts. Information need. Indexing model. Examples.
Retrieval models. Boolean Model. Vector Space model. Stemming.
Evaluation methodology. TREC. User experiments. Evaluation metrics.
Search engines and linkage algorithms.
PageRank and Kleinberg's Hubs and Authorities.
Task and evaluation. Lexico-semantic patterns.
Advanced information extraction methods.
Performance criteria and effectiveness measures, test methodology.
Overview of summarisation technology. Extractive versus abstractive summarisation. Evaluation.
At the end of this course, students should be able to
define the tasks of information retrieval, question
answering and information extraction, and the differences between them
understand the main concepts and strategies used in IR, QA, and IE
appreciate the challenges in these three areas
develop strategies suited to specific retrieval, extraction or question-answering
situations, and recognise the limits of these strategies
understand (the reasons for) the evaluation strategies developed for
these three areas
* Baeza-Yates, R. & Ribeiro-Neto, B. (1999). Modern information retrieval. Reading, MA: Addison-Wesley and ACM Press.
* Salton, G. & McGill, M. (1983). Introduction to modern information retrieval. New York: McGraw-Hill.
Spärck Jones, K. & Willett, P. (eds.) (1997). Readings in information retrieval. San Francisco: Morgan Kaufmann.