4  Task 4a: System APIs and Protocols

4.1  POSIX Filesystem Specification and Reasoning

We have pursued two directions here: operational modelling, validated by extensive testing against production filesystem implementations, and program-logic work oriented towards reasoning about filesystems.

Operational Modelling

SibylFS is a semantics and filesystem test oracle for POSIX filesystems, developed in a collaboration between semantics and system researchers led by Ridge at Leicester [81]. SibylFS has been used to validate and identify bugs in existing production filesystems and in two verified filesystems (FSCQ and Flashix) from academia. The SibylFS infrastructure was used by Docker, following its acquisition of Unikernel Systems (a spin-out from Cambridge). The SibylFS model has been integrated into our Task 3 Cerberus C semantics, providing an integrated semantic model for C programs that use a filesystem. SibylFS is available under a permissive open-source licence.


SibylFS web page: http://sibylfs.github.io/
SibylFS github repo: https://github.com/sibylfs/

Program-logic reasoning

In parallel, we have developed a separation-logic style program logic for a sequential specification of a core subset of POSIX file systems and for modular reasoning about client programs using file systems.

We developed a separation logic style program logic for path-addressable structured data, and applied this logic to develop a sequential specification of a core subset of POSIX file systems. This work demonstrated the modularity and scalability of reasoning about file-system client programs using simplified paths that allow updates to appear local [86]. We then developed fusion logic, a file-system specific program logic based on separation logic, for reasoning about non-local properties of arbitrary paths using symbolic links or dot-dots. We extended the previous sequential specification of core POSIX file systems to arbitrary paths. The extended specification and fusion logic were then used to reason about implementations of the rm utility, discovering bugs in popular implementations due to mishandling of symbolic links and dot-dots in paths [84].

We have extended concurrent program logics with reasoning about how programs are affected by, and recover from, host failures (crashes), as in file systems and databases [85]. This extends the Views framework, which acts as a basis for many concurrent program logics, with host failure and recovery semantics; we instantiated this with a Fault-tolerant Concurrent Separation Logic and used it to verify properties of an ARIES recovery algorithm used in databases.

4.2  TLS Security

In another collaboration between our systems and semantics groups, we have developed a clean-slate specification and implementation, nqsb-TLS, of the Transport Layer Security (TLS) protocol. This is fundamental to internet use, underlying https secure web access and ssh interaction, but its implementations have a history of security flaws [77, 76]. This is available under a permissive open-source licence. We engaged in the TLS 1.3 design process [75], providing extensive testing facilities for other implementations; and we presented an outreach talk at 31C3 [78].

We developed libnqsb-tls1 [74], a drop-in replacement for the C libtls library that wraps our pure OCaml TLS implementation ocaml-tls. We have also demonstrated unmodified OpenBSD system software running against our replacement library.


nqsbTLS web page: https://nqsb.io/
nqsbTLS github repo: https://github.com/mirleft/ocaml-tls
libnqsb-tls1 github repo: https://github.com/mirleft/libnqsb-tls/.

4.3  TCP/IP

We have resurrected the NetSem network semantics from 2000–2009, which defined an executable-as-test-oracle specification for the TCP and UDP network protocols and their Sockets API, using HOL4. In collaboration with the FreeBSD project, and with the DARPA CADETS project at BAE/Cambridge/Memorial, we now collect traces from a current FreeBSD system using DTrace1; CADETS is extending DTrace to support this work. Exploiting DTrace lets us substantially simplify the instrumentation and to instrument in more places, allowing nondeterminism to be resolved early; together with Moore’s-law advances in checking speed, this let us produce a more usable test oracle. We have published a J.ACM journal paper on the entire NetSem project, including this [80].

The model and infrastructure are available under a permissive open-source licence.


NetSem web page: https://www.cl.cam.ac.uk/users/pes20/Netsem/
NetSem github repo: https://github.com/PeterSewell/netsem.