Malware Antivirus Scan Pattern Mining via Tensor Decomposition

Abstract

Accurate labeling is important for detecting malware and for building reference datasets used to evaluate machine learning (ML) based malware classification and clustering approaches. Labels obtained from Anti-Virus (AV) vendors (such as Kaspersky, Malwarebytes, and McAfee) are one source of information; however, despite ongoing research efforts, labeling remains inconsistent across AV vendors. Vendors use differing formats and naming conventions when reporting labels for malware samples, and the labels reported by any two vendors can disagree. We address this problem using CP-APR (CANDECOMP/PARAFAC Alternating Poisson Regression), a tensor decomposition method for unsupervised ML, to discover hidden patterns in the way AV vendors report malware labels. In contrast to traditional ML methods, tensor decomposition models the multi-dimensional properties of the data and produces interpretable results. The higher-dimensional representation of the AV scans enables the discovery of multi-faceted and complex details of those scans.
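
As a rough illustration of the kind of decomposition the abstract refers to, the sketch below fits a non-negative CP model to a small count tensor with Poisson-style multiplicative updates. It is a simplified stand-in, not the authors' implementation: it performs a single multiplicative update per mode per sweep and omits the inner iterations, convergence checks, and zero-handling of the full CP-APR algorithm. The tensor layout (sample x vendor x label token), the function names cp_apr_sketch and poisson_loss, and the synthetic data are all assumptions made for this example; the abstract does not specify how the AV scans are encoded.

```python
import numpy as np


def cp_apr_sketch(X, rank, n_iter=50, seed=0, eps=1e-12):
    """Simplified CP-APR-style fit for a dense, non-negative 3-way count tensor.

    Hypothetical helper: one multiplicative (Poisson / KL) update per mode
    per sweep. Returns component weights `lam` and three factor matrices
    whose columns each sum to 1.
    """
    rng = np.random.default_rng(seed)
    factors = []
    for dim in X.shape:
        F = rng.random((dim, rank)) + 0.1       # strictly positive start
        factors.append(F / F.sum(axis=0))       # normalize columns to sum to 1
    lam = np.full(rank, X.sum() / rank)         # spread total mass evenly

    subs = ["ir", "jr", "kr"]
    for _ in range(n_iter):
        for n in range(3):
            # Absorb the weights into the mode being updated.
            scaled = factors[n] * lam
            ops = list(factors)
            ops[n] = scaled
            model = np.einsum("ir,jr,kr->ijk", *ops)
            ratio = X / np.maximum(model, eps)
            a, b = [m for m in range(3) if m != n]
            # Project the data/model ratio onto the other two factors.
            phi = np.einsum(
                f"ijk,{subs[a]},{subs[b]}->{subs[n]}",
                ratio, factors[a], factors[b],
            )
            scaled = scaled * phi                # multiplicative update
            lam = scaled.sum(axis=0)             # pull the weights back out
            factors[n] = scaled / np.where(lam > 0, lam, 1.0)
    return lam, factors


def poisson_loss(X, lam, factors, eps=1e-12):
    """Poisson negative log-likelihood of the model, up to a constant."""
    model = np.einsum("r,ir,jr,kr->ijk", lam, *factors)
    return float(np.sum(model - X * np.log(np.maximum(model, eps))))


# Synthetic stand-in for AV scan counts (assumed layout, not real data):
# 6 samples x 5 vendors x 4 label tokens, drawn from two planted components.
rng = np.random.default_rng(42)
planted = [rng.random((d, 2)) for d in (6, 5, 4)]
X = rng.poisson(5.0 * np.einsum("ir,jr,kr->ijk", *planted)).astype(float)

lam, factors = cp_apr_sketch(X, rank=2)
```

In this layout, each column of factors[1] would be read as a profile over vendors and each column of factors[2] as a profile over label tokens, which is the sense in which such components can be inspected directly.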

Publication
Presented at the 13th Annual Malware Technical Exchange Meeting, Online, 2022

Keywords:

Tensors, Machine Learning, AV, Malware

Citation:

Bhandary, P., Vieson, C., Kiendrebeogo, A., Adetunji, I., Joyce, R., Eren, M. E., and Nicholas, C. (2022). Malware Antivirus Scan Pattern Mining via Tensor Decomposition. Presented at the 13th Annual Malware Technical Exchange Meeting, Online, 2022.

BibTeX:

@misc{Bhandary2022MTEM,
      title={Malware Antivirus Scan Pattern Mining via Tensor Decomposition}, 
      author={P. {Bhandary} and C. {Vieson} and A. {Kiendrebeogo} and I. {Adetunji} and R. {Joyce} and M. E. {Eren} and C. {Nicholas}},
      year={2022},
      note={Presented at the 13th Annual Malware Technical Exchange Meeting, Online, 2022}
}
Maksim E. Eren
Scientist

My research interests lie at the intersection of the machine learning and cybersecurity disciplines, with a concentration in tensor decomposition.