Empirical Network Structure of Malicious Programs

Original Research (Published On: 20-Feb-2024)

DOI: https://dx.doi.org/10.54364/AAIML.2024.41112

John Musgrave, Alina Campan, Temesguen Messay-Kebede and David Kapp

Adv. Artif. Intell. Mach. Learn., 4 (1):1959-1976

1. John Musgrave: University of Cincinnati

2. Alina Campan: Northern Kentucky University

3. Temesguen Messay-Kebede: Air Force Research Lab, Wright-Patterson Air Force Base

4. David Kapp: Air Force Research Lab, Wright-Patterson Air Force Base

Article History: Received on: 20-Jan-24, Accepted on: 07-Feb-24, Published on: 20-Feb-24

Corresponding Author: John Musgrave

Email: musgrajw@mail.uc.edu

Citation: John Musgrave, Alina Campan, Temesguen Messay-Kebede, David Kapp, Boyang Wang (2024). Empirical Network Structure of Malicious Programs. Adv. Artif. Intell. Mach. Learn., 4 (1):1959-1976

Abstract

A modern binary executable is a composition of several types of networks. Control flow graphs are a common representation of executable programs in classification tasks. Control flow and term frequency representations are widely adopted, but they provide only a partial view of program semantics and are difficult to refine to a higher feature resolution. By performing a quantitative analysis of program networks, we enable the identification of patterns within these features that are correlated with structure. This allows for increases in feature resolution and pattern recognition in classification tasks, which are necessary steps toward greater explainability in classification results. We demonstrate the presence of scale-free properties of network structure in program data dependency and control flow graphs, and show that data dependency graphs also have small-world structural properties. We show that program data dependency graphs have a degree correlation that is structurally disassortative, and that control flow graphs have neutral degree assortativity, indicating that using random graphs to model the structural properties of program control flow graphs would yield increased accuracy. An increase in feature resolution allows the structural properties of program classes, as well as those of their component parts, to be analyzed for patterns. By increasing feature resolution within labeled datasets of executable programs, we provide a quantitative basis for interpreting the results of classifiers trained on control flow graph (CFG) features. By capturing a complete picture of program networks, we can enable future work in mapping a program's operational semantics to its structure.
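
The degree assortativity mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it assumes an undirected graph given as a plain edge list, and the function name degree_assortativity is hypothetical. It computes the Pearson correlation between the degrees found at the two ends of every edge, which is the usual definition of the degree assortativity coefficient.

```python
from collections import defaultdict


def degree_assortativity(edges):
    """Pearson correlation of endpoint degrees over undirected edges.

    Illustrative helper (assumption, not taken from the paper).
    `edges` is a list of (u, v) pairs with no self-loops.
    """
    # Count the degree of every node.
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Each undirected edge is counted in both orientations so that
    # the two degree sequences share the same mean and variance.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n

    if var == 0:
        # Undefined when every edge endpoint has the same degree,
        # for example in a cycle or any other regular graph.
        return float("nan")
    return cov / var


# A star graph: one hub connected to four leaves.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
# A path graph on four nodes.
path = [(0, 1), (1, 2), (2, 3)]

print(degree_assortativity(star))
print(degree_assortativity(path))
```

Under this definition, a negative coefficient means that high-degree nodes tend to connect to low-degree nodes (disassortative, the pattern reported for data dependency graphs), while a value near zero means there is no degree correlation (neutral, the pattern reported for control flow graphs). The star graph is the extreme disassortative case and evaluates to -1.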
