Computer Laboratory

Course pages 2012–13

Language and Concepts

Principal lecturer: Prof Ann Copestake
Taken by: MPhil ACS, Part III
Code: R207
Hours: 16
Class limit: 8 students
Prerequisites: L100 Introduction to Natural Language Processing


The notion of a concept is crucial to the way that we think about representation of language and human cognition. We assume that it makes sense to say that humans have concepts such as thumb, walk, water, blue, pencil, two, employee, democracy and DNA and that they must somehow acquire these concepts. For concepts which are pre-linguistic, humans must acquire the connection between a mental representation and the terms their native language uses for those concepts. AI and NLP/computational linguistics are the most obvious areas of computer science concerned with the representation of concepts, but any other area of CS which concerns itself with modelling the real world in a way which is comprehensible to humans also requires some notion of concept. However, other than the idea that concepts are somehow related to lexical items, there is no agreement on what a concept actually is, or how to represent concepts in general, or even how to represent any particular concept. That this matters is shown, for instance, by the difficulty of schema matching (relating independently constructed databases) and indeed of building the semantic web. The aim of this course is to start from a computational perspective but to give an idea of the interdisciplinary issues involved: it is intended to provide an introduction for students interested in AI or computational linguistics but also to be accessible to people primarily working on other CS topics.


The course will have eight 2-hour sessions, listed below. Most of these will be seminars rather than lectures: students will be expected to do the assigned reading before the session and come prepared to discuss the material. Most sessions will consist of some general introduction followed by an in-depth examination of some particular piece of research described in one or more papers. In the final two ‘open’ sessions, students will present selected research papers on individual topics to be decided in consultation with the module leader.

  • Introduction and overview of the course. Informal concept representation: dictionaries, encyclopedias and folksonomies. Computational exploitation of these resources.
  • Concepts in computer science. Description logics and their use in the semantic web. Terminology databases, taxonomies and ontologies in eScience.
  • Concepts in logic and linguistics. Concepts and compositional semantics. Quantification and number in natural languages.
  • Concepts in computational linguistics. Inference and concepts. Distributional semantics and its relationship to symbolic approaches to concepts.
  • Concepts in cognitive science and philosophy. Grounding. Human concept acquisition and the innateness debate.
  • Concepts in neuroscience. Experimental evidence concerning the brain's encoding of word meaning.
  • Open session 1. Student presentations.
  • Open session 2. Student presentations.


On completion of this module, students should:

  • Have a general understanding of the notions of concept in different disciplines and an in-depth understanding of selected topics;
  • Understand the main advantages and disadvantages of the approaches that have been proposed for concept representation;
  • Be able to explore the research literature on their own (including relevant work in other disciplines) and to summarise research papers on a topic.


Students will be expected to complete the required reading before each seminar (15 hours). Each student attending will prepare one 20–30 minute presentation on a research paper, which will involve more in-depth reading on a particular topic (10 hours), and will complete a 4,000-word essay on the same topic as the presentation (15 hours). The choice of topic will be discussed with the module leader. Each student will work on a different topic.


It will be mandatory for students to give a presentation, but the presentation will not be marked. The module will be assessed by a 4,000-word essay, marked by the module leader using a percentage score. The essay will be due two weeks after the end of the module (subject to timetabling).

Recommended reading

Jurafsky, D. & Martin, J. (2008). Speech and language processing. Prentice-Hall (2nd edition).

This will be used for general background. Other readings will be announced before the start of the module.


R207 Language and Concepts cannot be taken in conjunction with P35 System on Chip Design and Modelling in 2012–13.
