Papers similar to 2605.21821v1

~ similar to 2605.21821v1· 20 results

cs.CRcs.AIRecentApr 1, 2026

Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks

Anubhab Sahu, Diptisha Samanta, Reza Soosahabi

The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…

View →

cs.CRcs.LGRecentApr 30, 2026

Trident: Improving Malware Detection with LLMs and Behavioral Features

Rebecca Saul, Jingzhi Jiang, Elliott Chia, David Wagner

The paper introduces Trident, a novel malware detection system that combines static features, LLM-derived behavioral rules, and direct LLM analysis to achieve superior robustness against concept drift…

View →

cs.CRcs.AIRecentJun 1, 2026

Large Byte Model: Teaching Language Models About Compiled Code

Florian Störtz, Catalin-Andrei Stan, Alexandru Dinu, Sandra Servia-Rodríguez +3 more

The paper introduces the first byte-native Large Language Model (LLM) capable of analyzing raw executable binary data, achieving high accuracy in tasks like malware and architecture classification.

View →

cs.CRRecentApr 13, 2026

RedShell: A Generative AI-Based Approach to Ethical Hacking

Ricardo Bessa, Rui Claro, João Trindade, João Lourenço

The paper introduces RedShell, a generative AI tool designed to help ethical hackers generate syntactically and semantically valid malicious PowerShell code, addressing the challenge of data scarcity…

View →

cs.CRRecentApr 13, 2026

Towards Automated Pentesting with Large Language Models

Ricardo Bessa, Rui Claro, João Trindade, João Lourenço

The paper introduces RedShell, a hardware-efficient framework that uses fine-tuned LLMs to automate the generation of syntactically and semantically valid offensive PowerShell code for pentesting.

View →

cs.CRcs.AIcs.SERecentMar 17, 2026

Detecting Data Poisoning in Code Generation LLMs via Black-Box, Vulnerability-Oriented Scanning

Shenao Yan, Shimaa Ahmed, Shan Jin, Sunpreet S. Arora +3 more

The paper introduces CodeScan, a novel black-box framework that detects data poisoning in code generation LLMs by analyzing structural similarities across multiple generations to identify recurring, v…

View →

cs.CRRecentMay 15, 2026

MalwarePT: A Binary-Level Foundation Model for Malware Analysis

Saastha Vasan, Yuzhou Nie, Kaie Chen, Yigitcan Kaya +5 more

MalwarePT introduces a novel binary-level foundation model, pretrained on Windows PE code-section bytes using a ModernBERT-style encoder, demonstrating superior transfer learning capabilities across v…

View →

cs.CRcs.LGRecentMay 18, 2026

Learning to Look Benign: Targeted Evasion of Malware Detectors via API Import Injection

Juozas Dautartas, Olga Kurasova, Juozapas Rokas Čypas, Viktor Medvedev

The paper proposes a framework to intentionally evade malware detectors by adding a small number of benign API imports, successfully demonstrating targeted misclassification into a chosen benign categ…

View →

cs.CRRecentApr 25, 2026

AsmRAG: LLM-Driven Malware Detection by Retrieving Functionally Similar Assembly Code

ElMouatez Billah Karbab

AsmRAG is a novel framework that improves malware detection by treating it as an evidence-based retrieval task using a code-specialized LLM, achieving high accuracy while providing transparent forensi…

View →

cs.CRcs.AIcs.CLRecentMay 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more

This paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a dynamic defense mechanism that traces and sanitizes untrusted control content i…

View →

cs.CRcs.AIcs.CLRecentMay 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more

The paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a defense mechanism that detects and sanitizes backdoor content planted across mul…

View →

cs.CRcs.LGRecentApr 22, 2026

Breaking Bad: Interpretability-Based Safety Audits of State-of-the-Art LLMs

Krishiv Agarwal, Ramneet Kaur, Colin Samplawski, Manoj Acharya +5 more

The paper conducts an interpretability-driven safety audit of eight state-of-the-art LLMs, demonstrating that while interpretability-based steering is a powerful auditing tool, model robustness varies…

View →

cs.CRcs.AIRecentApr 29, 2026

Enforcing Benign Trajectories: A Behavioral Firewall for Structured-Workflow AI Agents

Hung Dang

The paper proposes extbackslash codeName, a behavioral firewall that uses a parameterized deterministic finite automaton (pDFA) to enforce verified benign tool-call sequences and parameter bounds for…

View →

cs.CRcs.AIRecentApr 13, 2026

ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection

Wei Zhao, Zhe Li, Peixin Zhang, Jun Sun

ClawGuard is a novel runtime security framework that deterministically enforces user-confirmed rules at tool-call boundaries to protect LLM agents from indirect prompt injection.

View →

cs.CRcs.AIRecentApr 26, 2026

Evaluation of Prompt Injection Defenses in Large Language Models

Priyal Deep, Shane Emmons, Amy Fox, Kyle Bacon +3 more

The paper evaluates prompt injection defenses and finds that only external output filtering, implemented in application code, reliably prevents secret leaks from LLMs, demonstrating that model-based d…

View →

cs.CRcs.LGRecentApr 24, 2026

Adversarial Malware Generation in Linux ELF Binaries via Semantic-Preserving Transformations

Lukáš Hrdonka, Martin Jureček

This paper addresses the lack of research on adversarial malware generation for Linux ELF binaries by developing a new semantic-preserving generator that achieves a high evasion rate against modern de…

View →

cs.CRcs.AIcs.MARecentApr 20, 2026

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs

Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more

The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…

View →

cs.CRcs.LGRecentMay 25, 2026

Building an Adversarial Malware Dataset by Family and Type: Generation, Evasion, and Poisoning Evaluation

David Košťál, Martin Jureček

The paper constructs a large, adversarial malware dataset from real-world binaries, demonstrating high evasion rates and showing that even small amounts of poisoned data can severely compromise malwar…

View →

cs.CRRecentMay 5, 2026

The Infinite Mutation Engine? Measuring Polymorphism in LLM-Generated Offensive Code

Gabriel Hortea, Juan Tapiador

This paper quantifies the polymorphic capacity of a commercial LLM, demonstrating that it can cheaply generate large populations of structurally diverse, yet behaviorally equivalent, offensive code pa…

View →

cs.CRcs.LGcs.SERecentApr 21, 2026

Evaluating LLM-Generated Obfuscated XSS Payloads for Machine Learning-Based Detection

Divyesh Gabbireddy, Suman Saha

This paper proposes a structured pipeline using LLMs to generate and evaluate obfuscated XSS payloads, demonstrating that while LLMs can generate samples, they currently struggle to ensure payloads ma…

View →