Catch'em all: Classification of Rare, Prominent, and Novel Malware Families

National security is threatened by malware, which remains one of the most dangerous and costly cyber threats. As of last year, researchers reported 1.3 billion known malware specimens, motivating the use of data-driven machine learning (ML) methods …

Electrical Grid Anomaly Detection via Tensor Decomposition

Supervisory Control and Data Acquisition (SCADA) systems often serve as the nervous system for substations within power grids. These systems facilitate real-time monitoring, data acquisition, and control of equipment, and ensure smooth and efficient …

Interactive Distillation of Large Single-Topic Corpora of Scientific Papers

Highly specific datasets of scientific literature are important for both research and education. However, it is difficult to build such datasets at scale. A common approach is to build these datasets reductively by applying topic modeling on an …

MalwareDNA: Simultaneous Classification of Malware, Malware Families, and Novel Malware

Malware is one of the most dangerous and costly cyber threats to national security and a crucial factor in modern cyber-space. However, the adoption of machine learning (ML) based solutions against malware threats has been relatively slow. …

Malware-DNA: Machine Learning for Malware Analysis that Treats Malware as Mutations in the Software Genome

Malware is one of the most dangerous and costly cyber threats to organizations, the public, and national security, and a crucial factor in modern warfare. The adoption of ML-based solutions against malware threats has been relatively slow despite the …