Papers similar to 2604.17860v2

~ similar to 2604.17860v2· 20 results

cs.SEcs.CRRecentMay 27, 2026

Towards Demystifying and Repairing LLM-in-the-Loop Vulnerabilities

Yujie Ma, Jialin Rong, Chenxi Yang, Lili Quan +3 more

The paper addresses the gap in understanding real-world LLM-in-the-loop vulnerabilities by creating the LLMCVE dataset and demonstrating that these vulnerabilities are significantly harder to repair t…

View →

cs.CRRecentMay 20, 2026

VIPER-MCP: Detecting and Exploiting Taint-Style Vulnerabilities in Model Context Protocol Servers

Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more

VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…

View →

cs.CRRecentApr 27, 2026

AgentVisor: Defending LLM Agents Against Prompt Injection via Semantic Virtualization

Zonghao Ying, Haozheng Wang, Jiangfan Liu, Quanchen Zou +4 more

AgentVisor is a novel defense framework that uses semantic virtualization, inspired by OS principles, to significantly reduce LLM agent vulnerability to prompt injection while maintaining high utility…

View →

cs.CRcs.SERecentMay 20, 2026

FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction

Ze Sheng, Zhicheng Chen, Qingxiao Xu, Kewen Zhu +1 more

FuzzingBrain V2 is a multi-agent LLM system that significantly improves automated vulnerability discovery by ensuring all reported bugs are fuzzer-reproducible and handling complex cross-function depe…

View →

cs.CRRecentMay 3, 2026

AgenticVM: Agentic AI for Adaptive Software Vulnerability Management

Asrul Arifin, Hussain Ahmad, Yiyao Zhang, Diksha Goel

AgenticVM is a multi-agent framework that uses LLMs and specialized tools to automate and drastically reduce the volume of software vulnerabilities into actionable, prioritized queues.

View →

cs.CRRecentApr 22, 2026

Synthesizing Multi-Agent Harnesses for Vulnerability Discovery

Hanzhi Liu, Chaofan Shou, Xiaonan Liu, Hongbo Wen +3 more

The paper introduces AgentFlow, a novel framework that uses a typed graph DSL and feedback-driven optimization to automatically synthesize and improve multi-agent harnesses for discovering security vu…

View →

cs.CRcs.AIRecentMay 26, 2026

Lessons from Penetration Tests on Large-Scale Agent Systems

Kevin Eykholt, Dhilung Kirat, Xiaokui Shu, Jiyong Jang +2 more

The paper reports on penetration tests conducted on proprietary, large-scale AI agent systems, finding that security vulnerabilities persist despite stricter development standards.

View →

cs.CRcs.LGRecentApr 25, 2026

A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

Kexin Chu

The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…

View →

cs.CRcs.AIRecentMar 24, 2026

Agent Audit: A Security Analysis System for LLM Agent Applications

Haiyue Zhang, Yi Nian, Yue Zhao

Agent Audit is a novel security analysis system that comprehensively audits LLM agent applications by examining the entire software stack—including tool code, configuration, and prompts—to detect a wi…

View →

cs.CRcs.AIRecentApr 29, 2026

Autonomous LLM Agents & CTFs: A Second Look

Youness Bouchari, Matteo Boffa, Marco Mellia, Idilio Drago +2 more

The paper re-evaluates LLM agents on CTFs, finding that while general-purpose agents like claude-code are strong baselines, specialized, modular architectures significantly improve performance and con…

View →

cs.CRRecentMar 30, 2026

VulnScout-C: A Lightweight Transformer for C Code Vulnerability Detection

Aymen Lassoued, Nacef Mbarek, Bechir Dardouri, Bassem Ouni +2 more

The paper introduces VULNSCOUT-C, a compact, specialized transformer model that achieves state-of-the-art performance in C code vulnerability detection while maintaining low inference cost, making it…

View →

cs.CRRecentMay 14, 2026

Toward Securing AI Agents Like Operating Systems

Lukas Pirch, Micha Horlboge, Patrick Großmann, Syeda Mahnur Asif +3 more

This paper analyzes the security of LLM-based autonomous agents by drawing parallels to operating system security, finding that while some vulnerabilities are inherent, many can be mitigated using est…

View →

cs.CRcs.AIRecentApr 3, 2026

Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis

Zhiyuan Li, Jingzheng Wu, Xiang Ling, Xing Cui +1 more

This paper provides the first comprehensive security analysis of the Agent Skills framework, identifying severe structural vulnerabilities that require fundamental architectural changes rather than si…

View →

cs.CRRecentMar 23, 2026

Semi-Automated Threat Modeling of Cloud-Based Systems Through Extracting Software Architecture from Configuration and Network Flow

Nicholas Pecka, Lotfi Ben Othmane, Bharat Bhargava, Renee Bryce

The paper proposes a novel semi-automated method to perform continuous threat modeling by inferring the actual system architecture from combined static configuration and dynamic network flow data, sig…

View →

cs.CRcs.AIcs.MARecentApr 20, 2026

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs

Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more

The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…

View →

cs.CRRecentApr 9, 2026

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain

Hanzhi Liu, Chaofan Shou, Hongbo Wen, Yanju Chen +2 more

This paper systematically analyzes the threat posed by malicious third-party API routers in the LLM supply chain, finding that a significant number of routers actively perform payload injection, crede…

View →

cs.CRRecentMay 19, 2026

Hunting Vulnerability Variants in AI Infra: Measurement and Reference-Driven Detection

Tian Dong, Yanjun Chen, Shoufeng Zhang, Huaien Zhang +5 more

This paper measures the prevalence of recurring vulnerability patterns (variants) across multiple AI infrastructure repositories and proposes INFRASCOPE, a framework to automatically detect these vari…

View →

cs.CRcs.AIRecentMar 29, 2026

A Security Analysis of the OpenClaw AI Agent Framework

Surada Suwansathit, Yuxuan Zhang, Guofei Gu

This paper analyzes 470 security advisories in the OpenClaw AI agent framework, demonstrating that the system's structural weakness lies in per-layer trust enforcement, enabling cross-layer remote cod…

View →

cs.CRcs.SERecentApr 5, 2026

LLM-Enabled Open-Source Systems in the Wild: An Empirical Study of Vulnerabilities in GitHub Security Advisories

Fariha Tanjim Shifat, Hariswar Baburaj, Ce Zhou, Jaydeb Sarker +1 more

The paper analyzes GitHub security advisories for LLM-integrated open-source systems, finding that while most vulnerabilities map to existing code-level weaknesses, the architectural risks like Supply…

View →

cs.CRcs.AIcs.SERecentApr 7, 2026

Hackers or Hallucinators? A Comprehensive Analysis of LLM-Based Automated Penetration Testing

Jiaren Peng, Zeqin Li, Chang You, Yan Wang +16 more

This paper provides the first comprehensive systematization and large-scale empirical evaluation of existing LLM-based Automated Penetration Testing (AutoPT) frameworks, offering a structured taxonomy…

View →