Natural Language and Information Processing Research Group

Current or recent NLIP projects

AVeriTeC: Automated Verification of Textual Claims (ERC consolidator grant, 2021-2025)
Opening Up Minds: Engaging Dialogue Generated From Argument Maps (EPSRC, 2021-2023)
MONITIO: Bringing to the Market a New Generation of AI-Powered Media Monitoring Tools (EU, H2020, 2021-2023)
CybercrimeNLP (CC-NLP)
Giving Voice to Digital Democracies
The Institute for Automated Language Teaching and Assessment (ALTA)
Pandemic MT (coming soon)

Past projects

The group's research projects have covered language processing resources and tools, logic and formalisms, front ends (e.g. for database and unstructured information access), and speech processing. More recent and current projects, funded under both UK and European programmes, have involved further development of tools and processors; automatic summarising; text and spoken message retrieval; natural language processing for formal specifications; and the acquisition of lexical knowledge and construction of multilingual lexical knowledge bases. Projects with individual pages are:

DILiGENt: Domain-Independent Language Generation (EPSRC, 2015-2018)
SUMMA: Scalable Understanding of Multilingual Media (EU-H2020, 2016-2019)
EneMILP: Non-Monotonic Incremental Language Processing (EPSRC, 2018-2019)
The What-If Machine (WHIM)
Distributional Compositional Semantics for Text Processing (DisCoTex)
A Unified Model of Compositional and Distributional Semantics: Theory and Applications
The Education First-Cambridge Learner Corpus of English - a data driven approach to second language learning (see Anna Korhonen's page)
PANACEA - Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies
SpaceBook - Spatial & Personal Adaptive Communication Environment
FAUST - Feedback for User Adaptive Statistical Translation
Computational Natural Language Processing and the Neuro-Cognition of Language (no project web-page, see Anna Korhonen's page).
CRAB: Using Text Mining to Aid Cancer Risk Assessment
Integrating pragmatic insights with HPSG (no project web-page, see Ann Copestake's page).
Applying Computational Semantics (no project web-page, see Ann Copestake's page).
Delph-in interfaces project, funded by Boeing (no project web-page, see Ann Copestake's page).
Lexical Acquisition for the Biomedical Domain (no project web-page, see Anna Korhonen's page).
Studying the appropriateness of different formulations of a discourse relation in context (no project web-page, see Advaith Siddharthan's page).
SciBorg: Extracting the Science from Scientific Publications (see also the SciBorg wiki page)
FlySlip: Integrating Literature, Experiments and Curation in Drosophila Genomics Research
CitRAZ: Rhetorical Citation Maps and Domain-Independent Argumentative Zoning
ACLEX: Accurate and Comprehensive Lexical Classification for Natural Language Processing Applications
Accurate and Efficient Parsing of Biomedical Text
DeepThought
Multiword expressions
Robust Accurate Statistical Parsing (RASP)
Alvey Natural Language Tools
Acquilex
The Cambridge/Acquilex Lexical Database System

Further details on research and PhD topics can be found on individual homepages, which can be accessed via the list of NLIP people.

We constructed a research demo system which builds web pages automatically from group bibliographies: see the (slightly outdated) output for the NLIP group.

Computer Laboratory
