Classifying Malware Using Tensor Decomposition

Abstract

Tensor decomposition is a powerful unsupervised machine learning technique capable of modeling multidimensional data, including that related to malware. This chapter discusses a method that employs tensor decomposition for malware analysis. We introduce an innovative ensemble semi-supervised classification algorithm named Random Forest of Tensors (RFoT). RFoT leverages tensor decomposition to extract intricate latent patterns from the data. Our hybrid model combines multidimensional analysis with clustering to capture sample groupings within latent components, aiding in distinguishing between malware and benign-ware. The patterns extracted from malware data using tensor decomposition heavily rely on the configuration of the tensor, including dimension, entry, and rank selection. To encompass diverse perspectives offered by different tensor configurations, we adopt the "wisdom of crowds" philosophy. This involves leveraging decisions made by the majority within a randomly generated ensemble of tensors, varying in dimensions, entries, and ranks. We illustrate RFoT’s effectiveness in classifying Windows Portable Executable (PE) malware and benign-ware. To promote the utility of tensor decomposition for malware analysis and ensure the reproducibility of our results, we have made our code publicly available.
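
The general idea described in the abstract (random tensor configurations, decomposition, clustering within latent components, and a majority vote) can be illustrated with a toy sketch. This is not the authors' released implementation: every function name, the equal-width binning, the plain NumPy CP-ALS routine, the largest-gap split used as a stand-in for clustering, and the purity threshold are assumptions made here for illustration only.

```python
import numpy as np

def cp_als(T, rank, n_iter=25, seed=0):
    """Plain CP-ALS for a dense 3-way tensor; returns factor matrices [A, B, C]."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in T.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            # Khatri-Rao product of the two remaining factors
            kr = np.einsum('ir,jr->ijr', others[0], others[1]).reshape(-1, rank)
            unfolded = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])
            factors[mode] = unfolded @ kr @ np.linalg.pinv(gram)
    return factors

def build_tensor(X, dims, entry, n_bins=4):
    """Samples on mode 0, two binned features on modes 1-2, a third feature as the entry."""
    n = X.shape[0]
    T = np.zeros((n, n_bins, n_bins))
    idx = []
    for f in dims:
        col = X[:, f]
        edges = np.linspace(col.min(), col.max(), n_bins + 1)[1:-1]
        idx.append(np.digitize(col, edges))  # bin index in 0..n_bins-1
    T[np.arange(n), idx[0], idx[1]] = X[:, entry]
    return T

def vote_from_tensor(X, y, rng, max_rank=3, purity=0.9):
    """One randomly configured tensor casts votes; y uses -1 for unknown labels."""
    feats = rng.choice(X.shape[1], size=3, replace=False)  # random dimensions + entry
    rank = int(rng.integers(1, max_rank + 1))              # random rank
    T = build_tensor(X, feats[:2], feats[2])
    A = cp_als(T, rank, seed=int(rng.integers(1 << 31)))[0]  # sample-mode factor
    votes = np.zeros((len(y), 2), dtype=int)
    for r in range(rank):
        comp = A[:, r]
        order = np.argsort(comp)
        cut = np.argmax(np.diff(comp[order])) + 1  # split at the largest gap
        for members in (order[:cut], order[cut:]):
            known = members[y[members] >= 0]
            if known.size == 0:
                continue
            counts = np.bincount(y[known], minlength=2)
            if counts.max() / known.size >= purity:  # only near-uniform groups vote
                votes[members[y[members] < 0], counts.argmax()] += 1
    return votes

def rfot_sketch(X, y, n_tensors=20, seed=0):
    """Majority vote over the ensemble; samples with no votes stay -1 (abstain)."""
    rng = np.random.default_rng(seed)
    votes = sum(vote_from_tensor(X, y, rng) for _ in range(n_tensors))
    pred = y.copy()
    mask = (y < 0) & (votes.sum(axis=1) > 0)
    pred[mask] = votes[mask].argmax(axis=1)
    return pred

# Synthetic stand-in data: two classes, five positive-valued features.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.uniform(0, 1, (30, 5)), data_rng.uniform(2, 3, (30, 5))])
y_true = np.repeat([0, 1], 30)
y = y_true.copy()
y[data_rng.random(60) < 0.5] = -1  # hide roughly half of the labels
pred = rfot_sketch(X, y)
print("abstained:", int((pred < 0).sum()), "of", int((y < 0).sum()), "unknown samples")
```

In this sketch, a tensor votes only through groups whose known labels are nearly uniform, so a poorly configured tensor tends to abstain rather than mislead the ensemble. That is one plausible reading of the "wisdom of crowds" idea described above, not a statement of how the published method handles it.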

Publication
Chapter in the Springer Nature book Malware: Handbook of Prevention and Detection, 2024

Keywords:

malware, tensors

Citation:

Eren, M.E., Alexandrov, B.S., Nicholas, C. (2025). Classifying Malware Using Tensor Decomposition. In: Gritzalis, D., Choo, KK.R., Patsakis, C. (eds) Malware. Advances in Information Security, vol 91. Springer, Cham. https://doi.org/10.1007/978-3-031-66245-4_1

BibTeX:

@Inbook{Eren2025,
author="Eren, Maksim E.
and Alexandrov, Boian S.
and Nicholas, Charles",
editor="Gritzalis, Dimitris
and Choo, Kim-Kwang Raymond
and Patsakis, Constantinos",
title="Classifying Malware Using Tensor Decomposition",
bookTitle="Malware: Handbook of Prevention and Detection",
year="2025",
publisher="Springer Nature Switzerland",
address="Cham",
pages="3--36",
abstract="Tensor decomposition is a powerful unsupervised machine learning technique capable of modeling multidimensional data, including that related to malware. This chapter discusses a method that employs tensor decomposition for malware analysis. We introduce an innovative ensemble semi-supervised classification algorithm named Random Forest of Tensors (RFoT). RFoT leverages tensor decomposition to extract intricate latent patterns from the data. Our hybrid model combines multidimensional analysis with clustering to capture sample groupings within latent components, aiding in distinguishing between malware and benign-ware. The patterns extracted from malware data using tensor decomposition heavily rely on the configuration of the tensor, including dimension, entry, and rank selection. To encompass diverse perspectives offered by different tensor configurations, we adopt the ``wisdom of crowds'' philosophy. This involves leveraging decisions made by the majority within a randomly generated ensemble of tensors, varying in dimensions, entries, and ranks. We illustrate RFoT's effectiveness in classifying Windows Portable Executable (PE) malware and benign-ware. To promote the utility of tensor decomposition for malware analysis and ensure the reproducibility of our results, we have made our code publicly available.",
isbn="978-3-031-66245-4",
doi="10.1007/978-3-031-66245-4_1",
url="https://doi.org/10.1007/978-3-031-66245-4_1"
}
Maksim E. Eren
Scientist

My research interests lie at the intersection of the machine learning and cybersecurity disciplines, with a concentration in tensor decomposition.