Project Ideas

The project ideas below can be tailored for both Part II and Part III/MPhil students, depending on their scope. If you're interested in working on any of them (or on a related topic) feel free to get in touch: ceren.kocaogullar@cl.cam.ac.uk.

Contextual Integrity for LLM Agents

Contextual Integrity (CI) is a theory of privacy which says that information should only flow in ways appropriate to the social context. This idea can be useful for privacy in agentic AI systems (LLMs that plan and act). For example, if an agent is handling medical data, it should not reveal it to tools or subagents outside the permitted context. Here are some relevant papers about building agents that apply CI principles to agents [1, 2, 3]. There are many directions you can go from there papers, including:

It would be interesting to collect user answers and fine-tune a supervisor model based on them. You could do a user study to evaluate the success of your model in not sharing private information in inappropriate contexts.
Looking beyond the single-user case, how might we aggregate this data across users with similar societal norms in a privacy-preserving way?
Related to the above, how to identify such groups with high accuracy without compromising privacy?
If you have any other ideas about applying CI to LLMs, I am open to suggestions!

Securing the Model-Context Protocol (MCP)

MCP allows agents to use tools by exchanging structured prompts. However, attackers might exploit misconfigured tools, inject malicious contexts, or exfiltrate data via outputs. In this project, you can simulate and analyse such attacks, then implement defenses such as tool sandboxing, digital signatures, or prompt-level policy checks. You can evaluate the effectiveness of these defences in different realistic scenarios.

Privacy-Preserving Logging for Agentic AI Systems

LLM systems are often hard to debug without logs, but logging can leak sensitive data. In this project, you can design a privacy-aware logging system for an agentic framework (like LangChain or AutoGen), which supports features like redaction, anonymisation, or logging only derived summaries. You can implement this system and compare it to naive logging in terms of both privacy risk and developer usability.

Capability Tokens for Tool Access in LLM Agents

LLM agents often invoke external tools (e.g. APIs, databases) without fine-grained access control. This project is about implementing a capability-based security model, where each tool invocation requires an explicit token granted by the system. Tokens can encode permissions (e.g. read-only, rate-limited). You can evaluate how this system reduces the risk of unintended or malicious tool usage, and whether it remains usable in complex tasks.

Adversarial Testing of LLM Agent Toolchains

LLM agents often rely on external tools (e.g. APIs, scripts, calculators) to complete tasks; but what if those tools are buggy, misconfigured, or even malicious? In this project, you can develop a testing framework that injects adversarial tools into an agent’s environment. For example, you might simulate a tool that returns misleading data, causes an exception, or attempts to leak memory contents. You can then study how different types of agents respond (e.g. do they fail safely or continue with incorrect assumptions?), and propose defenses like tool verification and input validation.

← Back to main profile