Type-Based Decompilation
Author: Alan Mycroft
Title: Type-Based Decompilation
Abstract:
We describe a system which decompiles (reverse engineers)
C programs from target machine code by type-inference techniques.
This extends recent trends in the converse process of compiling high-level
languages whereby type information is preserved during compilation.
The algorithms remain independent of the particular architecture
by virtue of treating target instructions as register-transfer specifications.
Target code expressed in such RTL form is then transformed into
SSA form (undoing register colouring etc.), from which a set of
type constraints is generated. Iteration and recursion over data structures
cause the synthesis of appropriate recursive C structs;
this is triggered by, and resolves, occurs-check constraint violations.
Other constraint violations are resolved by C's casts and unions.
In the limit we use heuristics to select between equally suitable C code;
a good GUI would clearly facilitate its professional use.
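
As an illustration of the occurs-check step described in the abstract, the following is a minimal sketch and not taken from the paper. The RTL fragment in the comment, the 32-bit word offsets, and the names node, value, next and sum_list are all assumptions chosen for the example. It shows the kind of recursive C struct that could be synthesised when a loop over a linked structure yields a cyclic type constraint.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical RTL for a list-walking loop, already in SSA form.
 * Register names and offsets are illustrative; offset 4 assumes a
 * 32-bit target.
 *
 *   entry: s0 = 0
 *   loop:  r1 = phi(r0, r2)        ; r0 is the incoming argument
 *          s1 = phi(s0, s2)
 *          if r1 == 0 goto done
 *          t  = mem[r1 + 0]        ; load word at offset 0
 *          s2 = s1 + t             ; so t is constrained to int
 *          r2 = mem[r1 + 4]        ; load word at offset 4
 *          goto loop
 *   done:  return s1
 *
 * Sketch of the constraints this suggests:
 *   type(r1) = ptr(struct { int; type(r2) })
 *   type(r1) = type(r2)            ; from the phi node
 * Substituting gives T = ptr(struct { int; T }), which fails the
 * occurs check. Naming the struct and letting it refer to itself
 * resolves the violation.
 */
struct node {
    int value;          /* word at offset 0 */
    struct node *next;  /* word at offset 4 on the assumed target */
};

/* One plausible C rendering of the loop above. */
int sum_list(const struct node *p)
{
    int s = 0;
    while (p != NULL) {
        s += p->value;
        p = p->next;
    }
    return s;
}
```

Calling sum_list on a three-element list holding 1, 2 and 3 returns 6, and calling it on a null pointer returns 0. The field names carry no information recovered from machine code; in a real decompiler they would be generated or chosen by the user.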
Published version:
Please note that copyright has now been transferred to Springer
(Copyright (C) Springer-Verlag LNCS), so please
use the above web page for personal use only and make bibliographic
references to
Mycroft, A.
Type-Based Decompilation.
Lecture Notes in Computer Science:
Proc. ESOP'99, vol. 1576, Springer-Verlag, 1999.