Ugarte-Pedrero X., S3 Laboratory, University of Deusto |
Santos I., S3 Laboratory, University of Deusto |
And 6 more authors.
Computer Systems Science and Engineering | Year: 2013
Malware writers employ packing techniques (i.e., encrypting the real payload) to hide the actual code of their creations. Generic unpacking techniques execute the binary within an isolated environment (a 'sandbox') to recover the real code of the packed executable. However, this approach can be very time-consuming. A common remedy is to apply a filtering step that avoids executing non-packed binaries. To this end, supervised machine learning models trained with static features from the executables have been proposed. Nevertheless, these methods require a large number of packed and non-packed executables to be identified and labelled. In this paper, we propose a new method for packed executable detection that adopts collective learning approaches (a kind of semi-supervised learning) to reduce the labelling requirements of completely supervised approaches. We performed an empirical validation demonstrating that the system maintains a high accuracy rate even when the number of labelled instances in the dataset is reduced. © 2013 CRL Publishing Ltd.
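The general idea of labelling many samples from only a few labelled ones can be illustrated with a minimal Python sketch. It is not the collective learning method of the paper: the single static feature (byte entropy, which tends to be high for encrypted data), the function names `byte_entropy` and `propagate_labels`, and the nearest-neighbour propagation rule are all assumptions chosen only for illustration.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


def propagate_labels(labelled, unlabelled):
    """Hypothetical semi-supervised step, not the paper's algorithm.

    labelled:   list of (entropy, is_packed) pairs, assumed non-empty.
    unlabelled: list of entropy values with no label.

    Repeatedly picks the unlabelled sample closest to any already
    labelled sample, copies that neighbour's label, and adds the
    sample to the labelled pool. Returns the newly labelled pairs.
    """
    pool = list(labelled)
    remaining = list(unlabelled)
    while remaining:
        # Unlabelled sample with the smallest distance to the pool.
        best = min(remaining,
                   key=lambda x: min(abs(x - e) for e, _ in pool))
        # Label of its nearest labelled neighbour.
        _, label = min(pool, key=lambda p: abs(p[0] - best))
        pool.append((best, label))
        remaining.remove(best)
    return pool[len(labelled):]


# Two labelled samples (entropy values are made up for illustration).
seed = [(1.0, False), (7.8, True)]
guessed = propagate_labels(seed, [1.5, 7.2, 2.0, 6.9])
```

In a filtering step of the kind the abstract describes, only the samples labelled as packed would then be sent to the sandbox. A realistic detector would use many static features and a proper classifier rather than a single entropy value.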