Prerequisite courses: None, but a basic familiarity with probability is assumed

Aims

The course aims to characterise information retrieval in terms of
the data, problems and concepts involved. The main formal
retrieval models and the main evaluation methods are described. The course then
covers problems and standard solutions in information extraction and
in question answering.

Lectures

Information retrieval introduction.
Key problems and concepts. Information need. Indexing model. Examples.

Retrieval models. Boolean model. Vector Space model. Stemming.

Evaluation methodology. TREC. User experiments. Evaluation metrics.

Search engines and linkage algorithms.
PageRank and Kleinberg's Hubs and Authorities.

Information extraction.
Task and evaluation. Lexico-semantic patterns.

Advanced information extraction methods.
Bootstrapping. Learning.

Question answering.
Performance criteria and effectiveness measures, test methodology,
established results.

Overview of summarisation technology. Extractive versus abstractive summarisation. Evaluation.

Objectives

At the end of this course, students should be able to

define the tasks of information retrieval, question
answering and information extraction, and the differences between them

understand the main concepts and strategies used in IR, QA, and IE

appreciate the challenges in these three areas

develop strategies suited to specific retrieval, extraction or question
answering situations, and recognise the limits of these strategies

understand (the reasons for) the evaluation strategies developed for
these three areas

Recommended reading

* Baeza-Yates, R. & Ribeiro-Neto, B. (1999). Modern information retrieval. Reading, MA: Addison-Wesley and ACM Press.
* Salton, G. & McGill, M. (1983). Introduction to modern information retrieval. New York: McGraw-Hill.
* Spärck Jones, K. & Willett, P. (eds.) (1997). Readings in information retrieval. San Francisco: Morgan Kaufmann.