Computer Laboratory Home Page Search A-Z Directory Help
University of Cambridge Home Computer Laboratory::Timothy Griffin
Part II Projects

These projects involve the use of Neo4j, a graph-oriented database in the No-SQL family of databases. Neo4j is very easy to install and configure, and comes with a very expressive query language, cypher.

  • Coq4j: A Neo4j database for Coq dependency graphs. The Coq Proof Assistant is a widely used system for interactively developing formal proofs in a logic based on dependent types.

    The existence of large bodies of formalised computer science and mathematics gives rise to the opportunity to treat these libraries as objects of study in themselves. That is, we can explore the structure of mathematical theories using the same kinds of algorithms we might use to explore a social network graph.

    One example is coq-dpdgraph, a tool that generates directed graphs to capture the dependencies between coq objects (definitions, lemmas, proofs). Currently this can be used to generate static graphs that can be inspected with dot. This is very useful, but as theories get large it quickly becomes difficult to gain useful insights from enormous complex graphs.

    The Coq4j project aims to go a step further. This project will develop code to translate any graph generated by coq-dpdgraph to a Neo4j database. We can then use the cypher query language to interactively explore the structure of the associated theory. We will test this approach on a number of existing coq theories and develop a library of cypher queries that are especially tailored to exploring the structure of mathematical theories. We might call this empirical meta-mathematics.

    Our moonshot experiment will be an attempt to apply this framework to the formalisation of the Odd Order Theorem, a tour de force coq development comprised of over 15K definitions and 4K theorems.

    No previous knowledge of Coq, Neo4j, or group theory will be assumed.

  • BGP4j: A Neo4j database for analysis of inter-domain routing in the Internet The Border Gateway Protocol (BGP) is the routing protocol used to glue together all of the autonomous networks in the global Internet to provide one large interconnected network.

    There are several public data archives that have recorded all bgp updates at various internet exchange points (see for example RIS raw data ). These archives contain BGP updates and table dumps in a binary format which can be parsed with tools such as bgpdump and bgpreader.

    This project will transform bgp archives into Neo4j databases. It will develop a graph-oriented data model that captures BGP relationships between networks (Autonomous Systems) as well as some of the rich temporal data contained in the BGP update archives. The project will develop a library of cypher queries that reveal some of the static and dynamic information locked up in these archives.