
Optimization of code caves in malware binaries to evade machine learning detectors

dc.contributor.author: Yuste, Javier
dc.contributor.author: García Pardo, Eduardo
dc.contributor.author: Tapiador, Juan
dc.date.accessioned: 2023-09-20T10:07:11Z
dc.date.available: 2023-09-20T10:07:11Z
dc.date.issued: 2022
dc.identifier.citation: Javier Yuste, Eduardo G. Pardo, Juan Tapiador, Optimization of code caves in malware binaries to evade machine learning detectors, Computers & Security, Volume 116, 2022, 102643, ISSN 0167-4048, https://doi.org/10.1016/j.cose.2022.102643 [es]
dc.identifier.issn: 0167-4048
dc.identifier.uri: https://hdl.handle.net/10115/24407
dc.description: This research was supported by the Ministerio de Ciencia, Innovación y Universidades (Grant Refs. PGC2018-095322-B-C22 and PID2019-111429RB-C21), by the Region of Madrid grant CYNAMON-CM (P2018/TCS-4566), co-financed by European Structural Funds ESF and FEDER, and the Excellence Program EPUC3M17. The opinions, findings, conclusions, or recommendations expressed are those of the authors and do not necessarily reflect those of any of the funders. [es]
dc.description.abstract: Machine Learning (ML) techniques, especially Artificial Neural Networks, have been widely adopted as a tool for malware detection due to their high accuracy when classifying programs as benign or malicious. However, these techniques are vulnerable to Adversarial Examples (AEs), i.e., carefully crafted samples designed by an attacker to be misclassified by the target model. In this work, we propose a general method to produce AEs from existing malware, which is useful to increase the robustness of ML-based models. Our method dynamically introduces unused blocks (caves) in malware binaries, preserving their original functionality. Then, by using optimization techniques based on Genetic Algorithms, we determine the most adequate content to place in such code caves to achieve misclassification. We evaluate our model in a black-box setting with a well-known state-of-the-art architecture (MalConv), resulting in a successful evasion rate of 97.99% on the 2k tested malware samples. Additionally, we successfully test the transferability of our proposal to commercial AV engines available at VirusTotal, showing a reduction in the detection rate for the crafted AEs. Finally, the obtained AEs are used to retrain the ML-based malware detector previously evaluated, showing an improvement in its robustness. [es]
dc.language.iso: eng [es]
dc.publisher: Elsevier [es]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.subject: Malware [es]
dc.subject: Evasion [es]
dc.subject: Machine learning [es]
dc.subject: Adversarial example [es]
dc.subject: Genetic algorithm [es]
dc.title: Optimization of code caves in malware binaries to evade machine learning detectors [es]
dc.type: info:eu-repo/semantics/article [es]
dc.identifier.doi: 10.1016/j.cose.2022.102643 [es]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [es]



Except where otherwise noted, this item's license is described as Attribution 4.0 International.