Supervising Students

    I'm generally excited to supervise student projects at Cambridge at the Part II, Part III, or MPhil level.

    If you would like to discuss a potential project, please get in touch.

    Project Ideas

    I've collected a set of potential student project ideas below:

    • Extend TextAttack with Imperceptible Perturbations - The TextAttack framework is a common tool used by researchers and practitioners to assess the robustness of text-based machine learning models against adversarial examples. This framework does not currently support Imperceptible Perturbations, which are a recent form of adversarial example that can be used to target NLP systems without any visual artifacts. This project would be to implement Imperceptible Perturbation support in a fork of TextAttack, and then to merge that functionality back into the framework via a pull request. This work is most likely a good fit as a Part II project.
    • Algorithmize Trojan Source Attacks - Trojan Source attacks are a method of hiding adversarial logic within source code's encoding. Such attacks are "visible" to compilers but "invisible" to human developers. To date, research into Trojan Source attacks has taken the form of general patterns described anecdotally. In practice, this makes crafting Trojan Source attacks more of an art than a science. This project would be designed to change that. The idea is to algorithmize these attacks. Done successfully, this project would produce two primary artifacts: (1) a theoretical algorithm that, given input source code, could produce well-crafted Trojan Source attacks that appear visually identical to that code, and (2) an implementation of this algorithm that demonstrates its efficacy against example programs. This project would likely be a good fit for Part III or MPhil, or in a partial form as a Part II project.
    • LLM Vulnerability Detection - Traditionally, automatic vulnerability detection in code occurs via static code analysis and fuzzing. These techniques tend to be very good at detecting specific classes of technical vulnerabilities, such as use-after-free bugs. However, they are less adept at detecting logical application errors, such as whether the correct AuthN model has been used. The advent of large language models (LLMs) may offer a better way to detect these flaws. The purpose of this project would be to investigate whether LLMs can be used to identify logical vulnerabilities in application software implementations. A successful project would produce a literature review, a theoretical analysis of the problem space, and experiments evaluating LLMs' abilities to detect logical vulnerabilities. Excellent projects may choose to fine-tune models as part of experimentation and submit results to relevant security/ML conferences. This project would likely be a good fit for Part III / MPhil students.
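    To give a flavor of the first idea: an imperceptible perturbation can be as simple as injecting zero-width Unicode characters, which most renderers do not display but which change the string a model sees. The sketch below is illustrative only (the `perturb` helper is hypothetical, not part of TextAttack):

    ```python
    # Minimal sketch of an "imperceptible" perturbation: insert zero-width
    # spaces so the text renders identically to a human reader but differs
    # byte-for-byte from the original when fed to an NLP model.
    ZWSP = "\u200b"  # ZERO WIDTH SPACE

    def perturb(text: str, positions: list[int]) -> str:
        """Insert a zero-width space before each given character index."""
        out, last = [], 0
        for pos in sorted(positions):
            out.append(text[last:pos])
            out.append(ZWSP)
            last = pos
        out.append(text[last:])
        return "".join(out)

    original = "the movie was great"
    perturbed = perturb(original, [4, 10])

    print(perturbed == original)           # False: the strings differ
    print(len(perturbed) - len(original))  # 2 extra, invisible code points
    ```

    A TextAttack integration would wrap a transformation like this in the framework's existing transformation/constraint interfaces rather than operating on raw strings.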
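    For the Trojan Source idea, the core trick is that Unicode bidirectional override characters change how code is *displayed* without changing how it is *executed*. A minimal sketch (the example string is hypothetical):

    ```python
    # Bidi override characters reorder rendered text. In an editor, the
    # string below can display as "useradmin", but logically it contains
    # two invisible control characters and the reversed substring "nimda".
    RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
    PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

    renders_alike = "user" + RLO + "nimda" + PDF

    print(renders_alike == "useradmin")  # False: not what it appears to be
    print(len(renders_alike))            # 11: includes two invisible code points
    ```

    An "algorithmized" attack generator would search for such encodings systematically, given target source code, rather than relying on hand-crafted patterns like this one.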
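    For the LLM project, one plausible experimental unit is a (snippet, prompt) pair where the snippet contains a logical flaw that static analyzers typically miss. The sketch below only constructs such a pair; the snippet, prompt wording, and the bug itself are hypothetical, and no model API is called:

    ```python
    # Illustrative input for an LLM-based logical-vulnerability experiment.
    # The code below authenticates the user (AuthN) but never checks that
    # the user owns the account (AuthZ) -- a logical flaw, not a memory bug.
    VULNERABLE_SNIPPET = '''
    def delete_account(request, account_id):
        if not request.user.is_authenticated:
            raise PermissionError("login required")
        Account.objects.get(id=account_id).delete()
    '''

    PROMPT_TEMPLATE = (
        "You are a security reviewer. Identify any logical vulnerabilities "
        "in the following code, focusing on authentication and "
        "authorization:\n\n{code}"
    )

    prompt = PROMPT_TEMPLATE.format(code=VULNERABLE_SNIPPET)
    print(prompt)
    ```

    Experiments would then score model responses (did the model flag the missing ownership check?) across a labeled corpus of such snippets.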