Optimization of code caves in malware binaries to evade machine learning detectors

Machine Learning (ML) techniques, especially Artificial Neural Networks, have been widely adopted for malware detection due to their high accuracy when classifying programs as benign or malicious. However, these techniques are vulnerable to Adversarial Examples (AEs), i.e., carefully crafted samples designed by an attacker to be misclassified by the target model. In this work, we propose a general method to produce AEs from existing malware, which can be used to increase the robustness of ML-based models. Our method dynamically introduces unused blocks (caves) in malware binaries while preserving their original functionality. Then, using optimization techniques based on Genetic Algorithms, we determine the most suitable content to place in such code caves to achieve misclassification. We evaluate our method in a black-box setting against a well-known state-of-the-art architecture (MalConv), achieving a successful evasion rate of 97.99% on the 2k tested malware samples. Additionally, we successfully test the transferability of our proposal to commercial AV engines available at VirusTotal, showing a reduction in the detection rate for the crafted AEs. Finally, the obtained AEs are used to retrain the ML-based malware detector previously evaluated, showing an improvement in its robustness.
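The optimization step mentioned in the abstract can be illustrated with a minimal, generic genetic algorithm. This is only a sketch under stated assumptions and not the authors' implementation: the names `toy_score`, `evolve`, `CAVE_SIZE`, and all parameter values are hypothetical, and the scoring function is a toy stand-in for a black-box score, not MalConv or any real detector. No binary is read or modified; the code only evolves a byte string toward a lower toy score.

```python
import random

# All constants are illustrative assumptions, not values from the paper.
CAVE_SIZE = 32        # number of filler bytes being optimized
POP_SIZE = 40         # candidates per generation
GENERATIONS = 60
MUTATION_RATE = 0.05  # per-byte probability of replacement


def toy_score(payload: bytes) -> float:
    """Toy stand-in for a black-box score in [0, 1]; lower is better.

    It is simply the normalized mean byte value, chosen so the search
    has a smooth objective. It does not model any real classifier.
    """
    return sum(payload) / (255 * len(payload))


def mutate(payload: bytes, rng: random.Random) -> bytes:
    # Replace each byte with a random one with probability MUTATION_RATE.
    return bytes(
        rng.randrange(256) if rng.random() < MUTATION_RATE else b
        for b in payload
    )


def crossover(a: bytes, b: bytes, rng: random.Random) -> bytes:
    # Single-point crossover: prefix of one parent, suffix of the other.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(score, seed: int = 0) -> bytes:
    """Return the best candidate found; only score values are observed."""
    rng = random.Random(seed)
    population = [
        bytes(rng.randrange(256) for _ in range(CAVE_SIZE))
        for _ in range(POP_SIZE)
    ]
    for _ in range(GENERATIONS):
        population.sort(key=score)
        elite = population[: POP_SIZE // 4]  # kept unchanged (elitism)
        children = []
        while len(elite) + len(children) < POP_SIZE:
            p1, p2 = rng.sample(elite, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        population = elite + children
    return min(population, key=score)


if __name__ == "__main__":
    best = evolve(toy_score)
    print(len(best), round(toy_score(best), 3))
```

Because the elite candidates are carried over unchanged, the best score never gets worse from one generation to the next. The search queries only the score of each candidate, which mirrors the black-box setting described in the abstract.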

Description

This research was supported by the Ministerio de Ciencia, Innovación y Universidades (Grant Refs. PGC2018-095322-B-C22 and PID2019-111429RB-C21), by the Region of Madrid grant CYNAMON-CM (P2018/TCS-4566), co-financed by European Structural Funds ESF and FEDER, and the Excellence Program EPUC3M17. The opinions, findings, conclusions, or recommendations expressed are those of the authors and do not necessarily reflect those of any of the funders.

Keywords

Malware, Evasion, Machine learning, Adversarial example, Genetic algorithm

Citation

Javier Yuste, Eduardo G. Pardo, Juan Tapiador, Optimization of code caves in malware binaries to evade machine learning detectors, Computers & Security, Volume 116, 2022, 102643, ISSN 0167-4048, https://doi.org/10.1016/j.cose.2022.102643

Collections

Journal Articles

Except where otherwise noted, this item's license is described as Attribution 4.0 International