Malware Analysis using Binary Visualization based on Byte Entropies and Mnemonic Sequences
- With the growth of the Internet, the amount of malicious software (malware) written and distributed, particularly for monetary profit, is increasing exponentially. Malware authors generate new malware and malware variants through various means, such as reusing modules or using automated malware generation tools. Variants written with reused modules share similarities and can be grouped into families, so these similarities can be analyzed and used for malware variant detection and malware family classification. However, obfuscation and packing techniques are applied during malware generation to evade or hinder static analysis. As a result, countermeasures lag behind the speed at which malware is generated and propagated, and the direct and secondary damage caused by malware infections continues to grow. New malware analysis techniques are therefore needed to reduce analysis overhead. To this end, several malware visualization methods, which visually represent features extracted from malware, have recently been proposed to help security analysts.
This dissertation proposes novel methods for visually analyzing malware and implements them as a visual analysis tool. The tool allows security analysts and researchers to visually analyze malware binary files. It also rapidly classifies malware into families by automatically calculating similarities between the visualized images.
The first visualization method is a byte entropy visualization method that transforms the byte information extracted from malware binary files into entropy graphs. That is, the method computes the entropy of runs of consecutive byte values of a specified length in a malware binary file and plots the resulting values as an entropy graph.
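The entropy computation described above can be illustrated with a minimal sketch. It assumes Shannon entropy over byte frequencies and non-overlapping 256-byte windows; the abstract does not state the window length, the step, or the exact entropy definition, so `window_size`, `step`, and both function names are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def window_entropy(window: bytes) -> float:
    """Shannon entropy, in bits per byte, of one window (0.0 to 8.0)."""
    if not window:
        return 0.0
    total = len(window)
    counts = Counter(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_series(data: bytes, window_size: int = 256, step: int = 256) -> list[float]:
    """Entropy of each window of consecutive bytes.

    Assumption: window_size and step are illustrative defaults, not
    values taken from the dissertation. Plotting the returned list
    against the window index gives an entropy graph.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    if not data:
        return []
    # Data shorter than one window is treated as a single short window.
    last_start = max(len(data) - window_size, 0)
    return [
        window_entropy(data[start:start + window_size])
        for start in range(0, last_start + 1, step)
    ]
```

A window of identical bytes yields an entropy of 0, while a window in which all 256 byte values occur equally often yields 8 bits per byte. Packed or encrypted regions therefore tend to appear as high plateaus in such a graph.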
The second visualization method is a block information visualization method that transforms basic blocks, extracted by disassembling malware binary files, into image matrices. When static analysis is not possible because of obfuscation or packing, basic blocks can instead be extracted from dynamic execution traces. The block information used in this dissertation is the mnemonic sequences of basic blocks that include a CALL instruction.
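One way to picture the mapping from mnemonic sequences to an image matrix is sketched below. The abstract does not describe how blocks are placed in the matrix, so the hashing scheme here (SHA-256 of the joined mnemonics, with two digest bytes selecting a cell) is purely an assumption for illustration, as are the function name `blocks_to_matrix` and the `size` parameter. The input is assumed to be basic blocks that have already been disassembled into lists of mnemonic strings.

```python
import hashlib


def blocks_to_matrix(blocks: list[list[str]], size: int = 8) -> list[list[int]]:
    """Map basic blocks containing a CALL onto a size x size intensity matrix.

    Assumption: cell selection by hashing is an illustrative scheme,
    not the dissertation's documented method.
    """
    if not 1 <= size <= 256:
        raise ValueError("size must be between 1 and 256")
    matrix = [[0] * size for _ in range(size)]
    for block in blocks:
        mnemonics = [m.lower() for m in block]
        # Keep only blocks whose mnemonic sequence includes a CALL.
        if "call" not in mnemonics:
            continue
        digest = hashlib.sha256(" ".join(mnemonics).encode("utf-8")).digest()
        row = digest[0] % size
        col = digest[1] % size
        # Repeated sequences brighten the same cell, capped at 255.
        matrix[row][col] = min(matrix[row][col] + 1, 255)
    return matrix
```

Because the cell depends only on the mnemonic sequence, two samples that share many identical blocks produce matrices with many cells in common, which is the property a later similarity comparison relies on.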
The visual images generated from malware binary files or dynamic execution traces can be used to analyze the relationships among malware samples through similarity analysis, and the similarity results can in turn be used for malware classification. The experimental results in this dissertation show that the visual images of malware binary files can be used to classify malware families with more than 22% improvement in speed and more than 94% accuracy.
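Similarity-based classification of image matrices can be sketched as follows. The abstract does not name the similarity measure, so cosine similarity over flattened matrices is used here as a stand-in, and classification is reduced to picking the most similar reference image. The names `cosine_similarity` and `nearest_family`, and the idea of one reference matrix per family, are hypothetical.

```python
import math


def cosine_similarity(a: list[list[int]], b: list[list[int]]) -> float:
    """Cosine similarity of two equally sized matrices (0.0 if either is all zero)."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("matrices must have the same number of cells")
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(x * x for x in flat_a))
    norm_b = math.sqrt(sum(y * y for y in flat_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_family(sample: list[list[int]], references: dict[str, list[list[int]]]) -> str:
    """Return the family whose reference matrix is most similar to the sample.

    Assumption: one representative matrix per family; a real tool could
    compare against many samples per family instead.
    """
    if not references:
        raise ValueError("references must not be empty")
    return max(references, key=lambda family: cosine_similarity(sample, references[family]))
```

For example, with references `{"family_a": [[1, 0], [0, 1]], "family_b": [[0, 1], [1, 0]]}`, a sample equal to `[[1, 0], [0, 1]]` is assigned to `family_a`, since its similarity to that reference is 1 and its similarity to the other is 0.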
- Appears in Collections:
- GRADUATE SCHOOL[S](대학원) > COMPUTER SCIENCE(컴퓨터·소프트웨어학과) > Theses (Ph.D.)