~ similar to 2605.01186v1· 20 results
ZERO-APT introduces a novel closed-loop adversarial framework for automated penetration testing that simulates attacks against an intelligent, real-time defending system, achieving a high attack succe…
William Lugoloobi, Samuelle Marro, Jabez Magomere, Joss Wright +1 more
This paper demonstrates that an agent's behavioral patterns, captured through passive UI interaction traces, are sufficient to identify the underlying LLM model with high accuracy, posing a significan…
This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…
This study empirically measures the consistency and effectiveness of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation rates am…
Guangze Zhao, Yongzheng Zhang, Weilin Gai, Hongri Liu +2 more
HunterAgent is a neuro-symbolic framework that reconstructs causal attack chains from fragmented, anti-forensics-corrupted logs, achieving high accuracy while drastically reducing hallucination.
Jiaren Peng, Zeqin Li, Chang You, Yan Wang +16 more
This paper provides the first comprehensive systematization and large-scale empirical evaluation of existing LLM-based Automated Penetration Testing (AutoPT) frameworks, offering a structured taxonomy…
Kevin Eykholt, Dhilung Kirat, Xiaokui Shu, Jiyong Jang +2 more
The paper reports on penetration tests conducted on proprietary, large-scale AI agent systems, finding that security vulnerabilities persist despite stricter development standards.
Chong Xiang, Drew Zagieboylo, Shaona Ghosh, Sanjay Kariyappa +4 more
The paper proposes a vision for system-level defenses against indirect prompt injection attacks targeting AI agents, emphasizing structured control and human oversight.
The paper addresses the 'agent attribution' problem—the inability to trace harmful or misbehaving AI agents back to their deploying account—by proposing a robust, canary-based protocol for vendors to…
TraceGuard introduces a structured, multi-dimensional monitoring protocol that significantly improves the detection of subtle attacks in AI agents while maintaining collusion resistance.
Red-MIRROR is a novel multi-agent LLM system that automates complex web penetration testing by integrating a memory-reflection backbone, achieving superior performance on industry benchmarks.
Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim +1 more
The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.
The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…
The paper proposes a secure-by-design Generative AI framework that integrates PromptShield for LLM security and CIAF for structured cloud forensic investigation, significantly improving both robustnes…
Chia-Pei, Chen, Kentaroh Toyoda, Anita Lai +1 more
The paper introduces IPI-proxy, an open-source intercepting proxy toolkit designed to red-team web-browsing AI agents by injecting adversarial payloads into live HTTP responses from whitelisted domain…
Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more
This paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a dynamic defense mechanism that traces and sanitizes untrusted control content i…
Jiejun Tan, Zhicheng Dou, Xinyu Yang, Yuyang Hu +3 more
The paper introduces ClawTrojan, a benchmark for multi-step trojan attacks against LLM agents, and proposes DASGuard, a defense mechanism that detects and sanitizes backdoor content planted across mul…
The paper introduces a kill-chain canary methodology to diagnose prompt injection vulnerabilities across multi-stage LLM pipelines, revealing that write-node placement and document format are critical…
The paper proposes the Layered Attack Surface Model (LASM), a structural taxonomy that maps security threats and defenses across the complex, multi-layered architecture of AI agents, revealing signifi…
Mohan Zhang, Yuqi Jia, Zhen Tan, Steven Jiang +3 more
This study provides the first systematic measurement of prompt injection attacks in a real-world LLM-based resume screening application, finding that approximately 1% of resumes contain hidden injecti…