Unity in diversity: Phylogenetic-inspired techniques for reverse engineering and detection of malware families
Abstract
We developed a framework for abstracting, aligning and analysing malware execution traces, and performed a preliminary exploration of state-of-the-art phylogenetic methods, whose strengths lie in pattern recognition and visualisation, to derive the statistical relationships within two contemporary malware families. We used phylogenetic trees and networks, motifs, logos, composition biases and tree topology comparison methods to identify common functionality and to study sources of variation in related samples. Networks were more useful than trees for visualising short nop-equivalent code metamorphism, while tree topology comparison was suited to studying variations in multiple sets of homologous procedures. We found that logos could be used for code normalisation, which reduced the number of instructions by 33% to 62%. A motif search showed that API sequences related to the management of memory, I/O, libraries and threading do not change significantly amongst malware variants, and composition bias provided an efficient way to distinguish between families. Using context-sensitive procedure analysis, we found that 100% of a set of memory management procedures used by the FakeAV-DO and “Skyhoo” malware families were uniquely identifiable. We discuss how phylogenetic techniques can aid the reverse engineering and detection of malware families and describe some related challenges.
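As a rough illustration of the composition-bias idea mentioned in the abstract, the following minimal sketch compares the symbol frequencies of an abstracted trace against per-family frequency profiles. It is not the method used in the paper: the single-letter trace alphabet, the Euclidean distance, the toy traces and the names `composition`, `distance` and `nearest_family` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def composition(trace):
    """Relative frequency of each symbol in an abstracted trace."""
    counts = Counter(trace)
    total = sum(counts.values())
    return {sym: n / total for sym, n in counts.items()}


def distance(p, q):
    """Euclidean distance between two composition profiles."""
    symbols = set(p) | set(q)
    return sqrt(sum((p.get(s, 0.0) - q.get(s, 0.0)) ** 2 for s in symbols))


def nearest_family(trace, profiles):
    """Return the family whose profile is closest to the trace's composition."""
    comp = composition(trace)
    return min(profiles, key=lambda fam: distance(comp, profiles[fam]))


# Hypothetical toy traces (assumed alphabet):
# M = memory, I = I/O, L = library, T = threading
profiles = {
    "family_a": composition("MMMMIILT"),
    "family_b": composition("TTTTLLIM"),
}

label = nearest_family("MMMIILTM", profiles)
print(label)
```

In this toy setting, a trace dominated by memory-related calls is assigned to the profile with the same bias. A real pipeline would need a principled trace abstraction and a distance measure chosen for the data.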
@inproceedings{khoo2011unity,
author = {Wei Ming Khoo and Pietro Lio},
title = {Unity in diversity: Phylogenetic-inspired techniques for reverse engineering and detection of malware families},
booktitle = {1st SysSec Workshop},
year = {2011}
}
Contact Information
Wei Ming Khoo
University of Cambridge
Computer Laboratory
15 JJ Thomson Avenue
Cambridge CB3 0FD
United Kingdom
wmk26[AT]cam[DOT]ac[DOT]uk