Paper Title
Empirical Network Structure of Malicious Programs
Paper Authors
Paper Abstract
A modern binary executable is a composition of several networks. Control flow graphs (CFGs) are commonly used to represent executable programs in the labeled datasets used for classification tasks. Control flow and term representations are widely adopted, but they provide only a partial view of program semantics. This study is an empirical analysis of the networks that compose malicious binaries, with the aim of providing a complete representation of a program's structural properties. We do this by measuring the structural properties of program networks in a dataset of malicious binary executables. We demonstrate that the network structure of both program data dependency graphs and control flow graphs has scale-free properties, and we show that data dependency graphs also have small-world structural properties. We show that program data dependency graphs have a structurally disassortative degree correlation, while control flow graphs have neutral degree assortativity, which indicates that modeling the structural properties of program control flow graphs with random graphs would yield increased accuracy. By increasing feature resolution within labeled datasets of executable programs, we provide a quantitative basis for interpreting the results of classifiers trained on CFG features. Increased feature resolution allows the structural properties of program classes, as well as of their component parts, to be analyzed for patterns. By capturing a complete picture of program graphs, we can enable theoretical solutions for mapping a program's operational semantics to its structure.
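The structural properties named in the abstract (degree assortativity, clustering, and path length, the latter two being the usual ingredients of a small-world test) can be illustrated with a minimal sketch. This is not the authors' measurement pipeline: the function names are hypothetical, the graph is treated as a simple undirected graph (real control flow and data dependency graphs are directed), and the star-shaped toy graph merely stands in for a hub-dominated program graph.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Build undirected adjacency sets from an edge list (self-loops dropped)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_assortativity(adj):
    """Pearson correlation between the degrees at the two ends of each edge.

    Negative values mean hubs tend to attach to low-degree nodes
    (disassortative); values near zero mean neutral mixing.
    """
    xs, ys = [], []
    for u, nbrs in adj.items():
        for v in nbrs:  # every undirected edge is visited in both directions
            xs.append(len(adj[u]))
            ys.append(len(adj[v]))
    mean = sum(xs) / len(xs)  # xs and ys share the same mean by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var = sum((x - mean) ** 2 for x in xs)
    # Assumption: the coefficient is undefined when all degrees are equal;
    # this sketch reports 0.0 in that case.
    return cov / var if var > 0 else 0.0


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy example: one hub (node 0) connected to four leaves.
star = build_adjacency([(0, i) for i in range(1, 5)])
print("assortativity:", degree_assortativity(star))
print("avg clustering:", average_clustering(star))
print("avg path length:", average_path_length(star))
```

On a real dataset, each statistic would be computed per program graph and then compared against a reference model (for example, a random graph with the same number of nodes and edges) before drawing conclusions about small-world or assortative structure.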