## Decompilation into Logic

### What?

By *decompilation into logic*, or simply *decompilation*, we
mean proof-certified translation of concrete machine code into
functional programs defined in the logic of a theorem prover.

### Why?

Decompilation is easily coded up as a tool, *a decompiler*, which
significantly simplifies verification of machine-code
programs. Instead of having to read and reason about machine code, the
verifier can, with the aid of a decompiler, reason about relatively
clean functional programs. Properties proved of these functional
programs carry over to the real machine code via theorems that the
decompiler proves when it extracts the functional programs: any
behaviour of the extracted functions is also the behaviour of the
machine code.

Decompilation can also be used to implement proof-producing synthesis tools and binary translators. Synthesis tools take as input a functional program and synthesise machine code which it proves has exactly the behaviour of the given input function. Binary translators take machine code for one architecture as input and produce machine code for a different architecture as output, with the output being guaranteed to have exactly the same behaviour of the original machine code.

### How?

We implement decompilation as a fully automatic proof tool within an interactive theorem prover (HOL4 in our case). The proofs it produces are based on detailed models of the instruction set architecture (ISA) of the machine language. We parametrise the decompiler by the underlying ISA model, i.e. the same decompiler can be used for multiple different ISAs.

Our initial approach to decompilation was based on automating proofs in a general-purpose Hoare logic for machine-code programs. Since then, we have refined the approach and made it significantly simpler, faster and more generally applicable. The new and improved version is described in the following paper.

Magnus O. Myreen, Michael J. C. Gordon and Konrad Slind.

Decompilation into Logic — Improved.

Submitted, 2012.

AbstractThis paper presents improvements to a technique which aids veriﬁcation of machine-code programs. This technique, called decompilation into logic, allows the veriﬁer to only deal with tractable extracted models of the machine code rather than the concrete code itself. Our improvements make decompilation simpler, faster and more generally applicable. In particular, the new technique allows the veriﬁer to always avoid tedious reasoning directly in the underlying machine-code Hoare logic or the model of the instruction set architecture. The method described in this paper has been implemented in the HOL4 theorem prover.

BenchmarksThe implementation that was used to produce the benchmarks presented in this paper are available in benchmarks.zip with instructions on how to build them.