Guaranteeing the correctness of implementations is not straightforward. There are various standards that specify a combination of design process and verification technology for achieving a spectrum of degrees of assurance. The highest assurance requires some use of formal methods. For example, security level 4 of the Federal Information Processing Standard (FIPS 140-2) and Evaluation Assurance Level 7 (EAL7) of the Common Criteria (CC) require some use of formal specification and proof.

Both FIPS Level 4 and CC EAL7 are intended to be achievable by current commercial contractors and are weaker than the state-of-the-art in research. In particular, neither requires formal verification of the "low-level design", i.e. of the implementation that is actually deployed. An even higher level of assurance would require formally verified implementations, but using current implementation technologies this is impractical. The reason is that industry standard high-level programming languages are too complex to have tractable formal definitions, and thus it is impossible to rigorously prove properties about designs expressed in them. The best that has been done so far is to:

- use a tractable subset of a real language (like SPARKAda), but then one has to be confident that the semantics of the subset
is consistent with the semantics of the full language (which will have to be an informal judgement);
- reason directly about really low level programs, like JVM or MC68020 instructions, which can be given an accurate formal semantics via a formal model of the machine on which they run.

What is needed is a high-level programming language with a tractable semantics (i.e. a semantics that supports non-trivial reasoning) together with an implementation method guaranteeing that the semantics corresponds exactly to the actual behaviour when programs are run. Existing high-level languages have neither a tractable semantics nor formally verified implementations.

Some languages do have a formal semantics, notably the denotational semantics of Scheme and the celebrated formal operational semantics of ML, but I think it is fair to say that these semantics do not provide a practical basis for formal reasoning about programs in the languages. They are, however, extremely valuable as reference documents and for proving meta-theorems (like the type preservation for ML).

A grand challenge would be to design a high level programming language that realistically competes with C++, C# and Java for implementating industrial strength systems, and which has (i) a tractable semantics that supports formal verification of non-trivial programs, (ii) a highly assured implementation guaranteeing that the semantics is correct, and (iii) a value proposition that ensures it gets used in the real world. This challenge is too grand -- it is probably impossible.

Even if the ultimate goal is impossible, then it can still be valuable to work on partial solutions. An analogy: many properties of computer programs are undecidable, but building decision procedures for useful domains is an active and important research area, with many practical applications.

Logic is manifestly a language that has maximum tractability for formal reasoning. We propose a challenge that is much less ambitious than the grand challenge mentioned above, but is nevertheless pretty challenging.

Program some practically useful systems directly in logic,
and implement logic in a way
that guarantees the program's execution is correct?

This challenge can be split into three parts:

- choose and solve a practical problem by programming in logic;
- run the program in a way that assures correctness;
- provide a convincing value proposition that 1 and 2 are useful.

There are at least two ways one might execute logic programs: perform a verified compilation to an assured execution platform, or use a theorem prover to obtain the results of each program by formal deduction. The first of these is verified compilation and the second is verified interpretation. Each may well have its uses.

There already are verified implementation platforms, such as Piton and ARM610, that could be the target for verified compilation. The CLI Stack provides a verified compiler for a higher level language, micro-Gypsy. This work comes closest to already achieving our challenge, and J Moore's own Grand Challenge has many similarities with what is proposed here.

Two ways one might perform verified interpretation by theorem proving are: (i) use a uniform proof procedure, such as rewriting and beta-reduction for functions and resolution for relations, and (ii) write a proof script in a meta-language to execute via domain-specific deduction. The first of these is how traditional logic programming works, the second is how the LCF system of Robin Milner works.

Sun Mar 27 2005