Type-Based Decompilation

Author: Alan Mycroft
Title: Type-Based Decompilation
Abstract: We describe a system which decompiles (reverse engineers) C programs from target machine code by type-inference techniques. This extends recent trends in the converse process of compiling high-level languages whereby type information is preserved during compilation. The algorithms remain independent of the particular architecture by virtue of treating target instructions as register-transfer specifications. Target code expressed in such RTL form is then transformed into SSA form (undoing register colouring etc.); this then generates a set of type constraints. Iteration and recursion over data structures cause synthesis of appropriate recursive C structs; this is triggered by, and resolves, occurs-check constraint violations. Other constraint violations are resolved by C's casts and unions. In the limit we use heuristics to select between equally suitable C code; a good GUI would clearly facilitate its professional use.
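Illustration: the recursive-struct synthesis mentioned in the abstract can be pictured with a small sketch. It is not taken from the paper. The RTL-style loop in the comment, the 32-bit field offsets, and the names `struct S`, `value`, `next` and `sum` are all hypothetical, chosen only to show the kind of C that such a type-inference pass might plausibly recover.

```c
#include <stddef.h>

/*
 * Hypothetical RTL-like input after SSA conversion (illustrative only,
 * not the paper's notation):
 *
 *   loop: if r0 == 0 goto done
 *         r1 = r1 + mem32[r0 + 0]
 *         r0 = mem32[r0 + 4]
 *         goto loop
 *   done: return r1
 *
 * r0 is dereferenced, so r0 : ptr(t) for some type t. The word loaded
 * from offset 4 flows back into r0, so t must itself contain a ptr(t)
 * at offset 4. Plain unification fails the occurs check at this point;
 * introducing a recursive struct is one way to resolve the violation.
 */
struct S {
    int value;       /* field at offset 0 (assumed) */
    struct S *next;  /* field at offset 4 on the assumed 32-bit target */
};

/* One plausible C reading of the loop above. */
int sum(const struct S *p)
{
    int total = 0;
    while (p != NULL) {
        total += p->value;  /* r1 = r1 + mem32[r0 + 0] */
        p = p->next;        /* r0 = mem32[r0 + 4]      */
    }
    return total;
}
```

Other readings of the same machine code are possible (for example, an array of pairs of ints plus casts), which is where the heuristics mentioned in the abstract would come in.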
Published version: Please note that copyright has now been transferred to Springer (Copyright (C) Springer-Verlag LNCS), so please use the above web page for personal use only and make bibliographic references to
Mycroft, A. Type-Based Decompilation. Lecture Notes in Computer Science: Proc. ESOP'99, vol. 1576, Springer-Verlag, 1999.