Papers similar to 2605.05000v1

~ similar to 2605.05000v1· 20 results

cs.CRcs.AIcs.SERecentMay 15, 2026

Detecting Privilege Escalation in Polyglot Microservices via Agentic Program Analysis

Penghui Li, Hong Yau Chong, Yinzhi Cao, Junfeng Yang

The paper introduces Neo, an agentic program analysis framework that successfully detects zero-day privilege escalation vulnerabilities in complex, polyglot microservices by combining LLMs with advanc…

View →

cs.CRRecentMay 20, 2026

VIPER-MCP: Detecting and Exploiting Taint-Style Vulnerabilities in Model Context Protocol Servers

Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more

VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…

View →

cs.CRcs.SERecentApr 7, 2026

Guiding Symbolic Execution with Static Analysis and LLMs for Vulnerability Discovery

Md Shafiuzzaman, Achintya Desai, Wenbo Guo, Tevfik Bultan

SAILOR automates the construction of symbolic execution harnesses by combining static analysis and LLM-based synthesis, significantly improving the scalability and effectiveness of vulnerability disco…

View →

cs.CRRecentMar 30, 2026

VulnScout-C: A Lightweight Transformer for C Code Vulnerability Detection

Aymen Lassoued, Nacef Mbarek, Bechir Dardouri, Bassem Ouni +2 more

The paper introduces VULNSCOUT-C, a compact, specialized transformer model that achieves state-of-the-art performance in C code vulnerability detection while maintaining low inference cost, making it…

View →

cs.CRRecentApr 27, 2026

AgentVisor: Defending LLM Agents Against Prompt Injection via Semantic Virtualization

Zonghao Ying, Haozheng Wang, Jiangfan Liu, Quanchen Zou +4 more

AgentVisor is a novel defense framework that uses semantic virtualization, inspired by OS principles, to significantly reduce LLM agent vulnerability to prompt injection while maintaining high utility…

View →

cs.CRcs.AIcs.MARecentApr 20, 2026

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs

Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more

The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…

View →

cs.CRcs.AIRecentMay 26, 2026

Lessons from Penetration Tests on Large-Scale Agent Systems

Kevin Eykholt, Dhilung Kirat, Xiaokui Shu, Jiyong Jang +2 more

The paper reports on penetration tests conducted on proprietary, large-scale AI agent systems, finding that security vulnerabilities persist despite stricter development standards.

View →

cs.GTcs.CRcs.OSRecentApr 9, 2026

VCAO: Verifier-Centered Agentic Orchestration for Strategic OS Vulnerability Discovery

Suyash Mishra

The paper introduces VCAO, a novel verifier-centered agentic orchestration framework that models OS vulnerability discovery as a Bayesian Stackelberg game, significantly improving vulnerability discov…

View →

cs.CRcs.SERecentMay 20, 2026

FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction

Ze Sheng, Zhicheng Chen, Qingxiao Xu, Kewen Zhu +1 more

FuzzingBrain V2 is a multi-agent LLM system that significantly improves automated vulnerability discovery by ensuring all reported bugs are fuzzer-reproducible and handling complex cross-function depe…

View →

cs.CRcs.AIcs.CLRecentMay 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more

This paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a dynamic defense mechanism that traces and sanitizes untrusted control content i…

View →

cs.CRcs.AIcs.CLRecentMay 29, 2026

From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors

Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more

The paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a defense mechanism that detects and sanitizes backdoor content planted across mul…

View →

cs.CRRecentApr 8, 2026

PoC-Adapt: Semantic-Aware Automated Vulnerability Reproduction with LLM Multi-Agents and Reinforcement Learning-Driven Adaptive Policy

Phan The Duy, Khoa Ngo-Khanh, Nguyen Huu Quyen, Van-Hau Pham

PoC-Adapt is an end-to-end framework that significantly improves the reliability and efficiency of automated vulnerability exploitation by integrating semantic state validation and reinforcement learn…

View →

cs.SEcs.AIcs.CLRecentApr 13, 2026

AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection

Zijie Zhao, Chenyuan Yang, Weidong Wang, Yihan Yang +2 more

AnyPoC introduces a general multi-agent framework that reliably generates and validates executable Proof-of-Concept (PoC) tests from candidate bug reports, significantly improving automated bug detect…

View →

cs.CRcs.SERecentMay 5, 2026

Generating Proof-of-Vulnerability Tests to Help Enhance the Security of Complex Software

Shravya Kanchi, Xiaoyan Zang, Ying Zhang, Danfeng Yao +1 more

The paper introduces PoVSmith, an agent-based system that uses large language models and call path analysis to automatically generate and assess proof-of-vulnerability tests, significantly improving t…

View →

cs.CRcs.AIRecentMay 4, 2026

When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI

Javad Forough, Marios Kogias, Hamed Haddadi

This survey analyzes the unique security threats posed by complex, multi-agent AI systems and proposes Confidential Computing (CC) using Trusted Execution Environments (TEEs) as a hardware-rooted defe…

View →

cs.CRcs.PLcs.SERecentApr 28, 2026

Symbolic Execution Meets Multi-LLM Orchestration: Detecting Memory Vulnerabilities in Incomplete Rust CVE Snippets

Zeyad Abdelrazek, Young Lee

The paper introduces a novel multi-LLM orchestration system combined with symbolic execution to successfully detect memory vulnerabilities in uncompilable, incomplete Rust CVE code snippets, achieving…

View →

cs.CRRecentMay 25, 2026

AgentSecBench: Measuring Prompt Injection, Privacy Leakage, and Tool-Use Integrity in LLM Agents

Faruk Alpay, Taylan Alpay

The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…

View →

cs.CRcs.LGRecentApr 25, 2026

A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

Kexin Chu

The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…

View →

cs.CRcs.AIRecentMay 7, 2026

Patch2Vuln: Agentic Reconstruction of Vulnerabilities from Linux Distribution Binary Patches

Isaac David, Arthur Gervais

The paper introduces Patch2Vuln, a pipeline that uses an LLM agent to reconstruct security vulnerabilities by analyzing differences between old and new Linux binary packages, successfully localizing p…

View →

cs.CRcs.SERecentMay 11, 2026

Agentic Fuzzing: Opportunities and Challenges

Junyoung Park, Insu Yun

The paper proposes agentic fuzzing, a novel bug-finding approach where deep agents perform direct reasoning based on historical bugs to discover logic bugs in mature codebases.

View →