Network Semantics
Network programming is complex: one has to deal with a variety of
protocols (IP, ICMP, UDP, TCP etc), concurrency, packet loss, host
failure, timeouts, the Sockets API for the protocols, and
subtle portability issues.
The protocols are typically described in RFCs using informal prose and
pseudocode to characterise the behaviour of the systems involved.
That informality has benefits, but inevitably these descriptions
are somewhat ambiguous and incomplete.
The protocols are hard to design and implement correctly; testing conformance
against the standards is challenging; and understanding the many obscure
corner cases and the failure semantics requires considerable expertise.
Ideally we would have the best of both worlds:
protocol descriptions that
are simultaneously:
- clear, accessible to a broad community and easy to modify;
- unambiguous, characterising exactly what behaviour is specified;
- sufficiently loose, characterising exactly what is not specified,
and hence what is left to implementors (especially,
permitting high-performance implementations without over-constraining
their structure); and
- directly usable as a basis for conformance testing, not
read-and-forget documents.
In this work we have established a practical
technique for rigorous protocol specification, in HOL4,
that makes this ideal attainable for
protocols as complex as TCP.
We describe specification idioms that are rich enough to express the
subtleties of TCP endpoint behaviour and that scale to the full
protocol, all while remaining readable. We also describe novel
tools for automated conformance testing between specifications and
real-world implementations.
To develop the technique, and to demonstrate its feasibility, we have
produced a post-hoc specification of existing protocols: a
mathematically rigorous and experimentally-validated characterisation
of the behaviour of TCP, UDP, and the Sockets API, as implemented in
practice, at both protocol and service levels of abstraction.
The resulting specifications may be useful in their own right in several
ways. They have been extensively annotated
to make it usable as a reference for TCP/IP stack
implementors and Sockets API users, supplementing the existing
informal standards and texts. They can also provide a basis for
high-fidelity conformance testing of future implementations, and a
basis for design (and conceivably formal proof) of higher-level
communication layers.
Perhaps more significantly, the work
demonstrates that
it would be feasible to carry out
similar rigorous
specification work for new protocols, in
a tolerably light-weight style, both at design-time and during standardisation.
We believe the increased
clarity and precision over informal specifications, and the possibility of automated specification-based testing,
would make this very much worthwhile, leading to
clearer protocol designs and higher-quality implementations.
We discuss some simple ways
in which protocols could be
designed to make testing computationally straightforward.
This github
repository contains an updated version of the original protocol-level
specification (described in Vol. 1 and 2 below) and revised tools, updated by Hannes Mehnert
2015-2019.
Overview Paper
This paper gives a overall view of our work on TCP, UDP, and the Sockets API:
(it does not cover the ICNP 06, POPL 09, or ATS 05 papers).
Protocol-Level TCP Specification
A theory-oriented description of the specification idioms and symbolic
model-checking technology used for the protocol-level specification of
TCP:
A systems-oriented introduction to the protocol-level TCP
specification work:
A full description of the project as of 2005, and the protocol-level specification itself
(typeset and extensively annotated):
-
TCP, UDP, and Sockets: rigorous and experimentally-validated
behavioural specification. Volume 1: Overview.
Steve Bishop, Matthew Fairbairn, Michael Norrish,
Peter Sewell, Michael Smith, Keith Wansbrough. 88pp. Technical Report 624. March 2005.
Also in pdf.
-
TCP, UDP, and Sockets: rigorous and experimentally-validated
behavioural specification. Volume 2: The Specification.
Steve Bishop, Matthew Fairbairn, Michael Norrish,
Peter Sewell, Michael Smith, Keith Wansbrough. 386pp. Technical Report 625. March 2005.
Also in pdf (with internal hyperlinks)
and in ps 2up (good for printing).
-
Errata for the spec.
A short FAQ-style introduction:
A poster with a version of the `TCP state diagram' extracted from the
specification. This is rather more complete than the
usual diagram,
but is still a very abstract and simplified view of the specification
state space. It is intended to be printed at A1 size or bigger:
Peng Li, a PhD student
with Steve Zdancewic,
used the specification as the basis for a purely functional Haskell
implementation of a TCP stack for a web server. Here is
Service-Level TCP Specification
A high-level specification of the service provided by TCP:
The service-level specification itself:
HOL Sources for Protocol-level and Service-Level TCP Specifications
A tarball of the HOL source files of the protocol-level and
service-level specifications:
Applying these techniques at protocol design time
Design-time specification and validation work on a MAC protocol for optically switched networking:
Reasoning above protocol specifications
A demonstration verification of distributed infrastructure, involving
networking, a filesystem, and concurrent OCaml code:
The following demonstrates the feasibility of completely formal
reasoning (in the Isabelle proof assistant) about executable code
in a fragment of OCaml above the UDP specification:
Earlier work on UDP, ICMP, and Sockets
Here are papers about our earlier work focussing just on UDP, ICMP, and the
Sockets API. The first technical report and TACS paper describe a
model without time, threads, or modules, and using informal
mathematics. The ESOP paper reports on a HOL specification extended
to cover those three aspects. The SIGOPS EW paper is a position paper
reflecting on the experience of this and of Norrish's C semantics work.
-
The UDP Calculus: Rigorous Semantics for Real
Networking.
Technical report 515. Andrei Serjantov, Peter Sewell, and Keith
Wansbrough. iv+70pp. July 2001.
Also in ps.gz and pdf.
-
The UDP Calculus: Rigorous Semantics for Real Networking. Andrei Serjantov, Peter Sewell, and Keith Wansbrough.
In
TACS
2001, LNCS 2215, 535-559.
Also in ps.gz.
-
Timing UDP: mechanized semantics for sockets, threads and failures.
Keith Wansbrough, Michael Norrish, Peter Sewell, Andrei Serjantov.
In ESOP 2002, LNCS 2305, 278-294.
Also in ps.gz and pdf.
-
Timing UDP: the HOL model.
Here are the HOL scripts and a typeset version of the timed host
semantics described in the paper above.
-
Rigour is good for you and feasible:
reflections on formal treatments of
C and UDP sockets.
Michael Norrish, Peter Sewell, Keith Wansbrough.
In SIGOPS EW 2002, 49-53.
Also in ps.gz.
A note written to aid the development of the TCP specification,
describing some of the behaviour of the Linux implementation:
People
Acknowledgements
This work has been funded by
the EPSRC REMS Programme Grant EP/K008528/1,
an EPSRC Leadership Fellowship and a Royal Society University Research
Fellowship (Sewell),
a St Catharine's College Heller Research
Fellowship (Wansbrough),
EPSRC grant GR/N24872 Wide-area programming: Language, Semantics
and Infrastructure Design,
EPSRC grant EP/C510712 NETSEM: Rigorous Semantics for Real Systems,
EC FET-GC project IST-2001-33234 PEPITO
Peer-to-Peer Computing: Implementation and Theory,
CMI UROP internship support (Smith), and
EC Thematic Network IST-2001-38957 APPSEM 2.
National ICT Australia is funded by the Australian Government's
Backing Australia's Ability initiative, in part through the
Australian Research Council.
Peter.Sewell@cl.cam.ac.uk
[Validate this page.]