Papers similar to 2606.04171v1

~ similar to 2606.04171v1· 19 results

cs.CRcs.AIcs.LGRecentMay 11, 2026

MambaNetBurst: Direct Byte-level Network Traffic Classification without Tokenization or Pretraining

Gayan K. Kulatilleke, Siamak Layeghy, Mahsa Baktashmotlagh, Marius Portmann

MambaNetBurst introduces a compact, tokenizer-free byte-level classifier using a Mamba-2 backbone to achieve strong network traffic classification without requiring pre-training or complex data prepro…

View →

cs.CRRecentMay 29, 2026

When Entropy Is Not Enough: Multi-Modal Classification of Encrypted and Compressed Data Fragments

Fabio De Gaspari, Dorjan Hitaj, Samuele Salaris, Luigi V. Mancini

The paper proposes Triumvir, a multi-modal ensemble architecture that significantly improves the classification of small, raw data fragments to distinguish between encrypted and compressed data, outpe…

View →

cs.CRRecentMay 15, 2026

MalwarePT: A Binary-Level Foundation Model for Malware Analysis

Saastha Vasan, Yuzhou Nie, Kaie Chen, Yigitcan Kaya +5 more

MalwarePT introduces a novel binary-level foundation model, pretrained on Windows PE code-section bytes using a ModernBERT-style encoder, demonstrating superior transfer learning capabilities across v…

View →

cs.CRcs.AIRecentJun 1, 2026

Large Byte Model: Teaching Language Models About Compiled Code

Florian Störtz, Catalin-Andrei Stan, Alexandru Dinu, Sandra Servia-Rodríguez +3 more

The paper introduces the first byte-native Large Language Model (LLM) capable of analyzing raw executable binary data, achieving high accuracy in tasks like malware and architecture classification.

View →

cs.CRRecentApr 25, 2026

AsmRAG: LLM-Driven Malware Detection by Retrieving Functionally Similar Assembly Code

ElMouatez Billah Karbab

AsmRAG is a novel framework that improves malware detection by treating it as an evidence-based retrieval task using a code-specialized LLM, achieving high accuracy while providing transparent forensi…

View →

cs.CRRecentApr 22, 2026

Image-Based Malware Type Classification on MalNet-Image Tiny: Effects of Multi-Scale Fusion, Transfer Learning, Data Augmentation, and Schedule-Free Optimization

Ahmed A. Abouelkhaire, Waleed A. Yousef, Issa Traor

The paper investigates improving 43-class malware type classification on MalNet-Image Tiny by evaluating the combined effects of multi-scale feature fusion, transfer learning, advanced data augmentati…

View →

cs.CRcs.LGRecentMay 18, 2026

Learning to Look Benign: Targeted Evasion of Malware Detectors via API Import Injection

Juozas Dautartas, Olga Kurasova, Juozapas Rokas Čypas, Viktor Medvedev

The paper proposes a framework to intentionally evade malware detectors by adding a small number of benign API imports, successfully demonstrating targeted misclassification into a chosen benign categ…

View →

cs.CRcs.CLRecentJun 4, 2026

An Embarrassingly Simple Detector for Model Extraction Attacks in Large Language Model API Traffic

Shuze Liu, Qianwen Guo, Yushun Dong

The paper proposes an embarrassingly simple detector that monitors model extraction attacks by testing whether the aggregate distribution of incoming LLM queries deviates from the historical distribut…

View →

cs.CRcs.LGRecentApr 24, 2026

Adversarial Malware Generation in Linux ELF Binaries via Semantic-Preserving Transformations

Lukáš Hrdonka, Martin Jureček

This paper addresses the lack of research on adversarial malware generation for Linux ELF binaries by developing a new semantic-preserving generator that achieves a high evasion rate against modern de…

View →

cs.CRcs.AIEmpiricalRecentJun 29, 2026

A Multi-task Mixture of Experts Framework for Malware Classification, Packing Detection, and Family Attribution

Jithin S., Roshin Sleeba C., Anvin Mariya P. B., Asmitha K. A. +3 more

A unified multi-task malware analysis framework based on Mixture of Experts (MoE) architectures is proposed for malware family classification, packed versus unpacked detection, and malware versus beni…

View →

cs.CLcs.AIcs.CRRecentApr 6, 2026

XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts

Jiahao Xu, Rui Hu, Olivera Kotevska, Zikai Zhang

XMark introduces a novel multi-bit watermarking technique that reliably embeds binary messages into LLM-generated text while maintaining high text quality and robust performance even with limited toke…

View →

cs.CRRecentApr 17, 2026

DEMUX: Boundary-Aware Multi-Scale Traffic Demixing for Multi-Tab Website Fingerprinting

Yali Yuan, Yaosheng Liu, Qianqi Niu, Guang Cheng

DEMUX is a novel framework that addresses the challenge of multi-tab website fingerprinting by treating the interleaved traffic as a demixing problem, achieving state-of-the-art performance in complex…

View →

cs.CRRecentMar 30, 2026

Safeguarding LLMs Against Misuse and AI-Driven Malware Using Steganographic Canaries

Md Raz, Venkata Sai Charan Putrevu, Meet Udeshi, Prashanth Krishnamurthy +2 more

The paper introduces a novel framework using steganographic canary files to detect and block unauthorized processing of sensitive documents by LLMs, even when the data passes through traditional secur…

View →

cs.CVcs.AIcs.CLRecentJun 1, 2026

Jailbreaking Multimodal Large Language Models using Multi-Clip Video

Choongwon Kang, Seungjong Sun, Hyunmin Jun, Jang Hyun Kim

The paper introduces Multi-Clip Video (MCV) SafetyBench, a dataset demonstrating that the vulnerability of Multimodal Large Language Models (MLLMs) to jailbreaking increases with the diversity and num…

View →

cs.CLRecentMay 28, 2026

Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs

Zhihao Wu, Gracia Gong, Qinglin Zhu, Yudong Chen +1 more

The paper demonstrates that combining outputs from multiple large language models (LLMs) effectively cancels out statistical watermarks, revealing a fundamental vulnerability in current AI text detect…

View →

cs.LGcs.CRRecentMar 30, 2026

Label-efficient Training Updates for Malware Detection over Time

Luca Minnei, Cristian Manca, Giorgio Piras, Angelo Sotgiu +5 more

The paper proposes a model-agnostic framework to evaluate combining Active Learning (AL) and Semi-Supervised Learning (SSL) techniques for malware detection, demonstrating that these combined methods…

View →

cs.CRcs.ARcs.LGRecentApr 25, 2026

Tessera: Secure, Near-Line-Rate Weight Streaming for UMA Edge Accelerators

Animan Naskar

Tessera introduces a novel hardware architecture that achieves secure, near-line-rate weight streaming for DNNs on UMA edge accelerators by performing cache-line granularity decryption during DRAM fet…

View →

cs.CRRecentMay 24, 2026

MemMark: State-Evolution Attribution Watermarking for Agent Long-Term Memory Systems

Haobo Zhang, Xutao Mao, Guangyuan Dong, Ziwei Li +4 more

MemMark introduces a state-evolution attribution watermark that embeds owner-controlled signals into latent memory-write decisions, enabling robust provenance tracking for agent memory even when all t…

View →

cs.CRcs.AIRecentMay 19, 2026

Security Document Classification with a Fine-Tuned Local Large Language Model: Benchmark Data and an Open-Source System

Ivan Dobrovolskyi

The paper introduces TorchSight, an open-source local system using a fine-tuned Qwen 3.5 27B model that achieves high accuracy (95.0%) in classifying sensitive security documents without relying on ex…

View →