~ similar to 2604.08407v1· 20 results
Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more
The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these honeypots provide substantially longer and harder-to-detect…
Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more
The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these systems provide substantially longer and harder-to-detect i…
Yubin Qu, Yi Liu, Tongcheng Geng, Gelei Deng +4 more
The paper introduces Document-Driven Implicit Payload Execution (DDIPE) to demonstrate that malicious code can be embedded in LLM agent skill documentation, allowing supply-chain attacks to hijack age…
This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…
This study empirically measures the consistency and effectiveness of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation rates am…
The paper identifies a critical vulnerability, the Camouflage Detection Gap (CDG), where standard LLM injection detectors fail dramatically when malicious payloads mimic the target domain's language a…
The paper introduces Semantic Compliance Hijacking (SCH), a novel payload-less attack that exploits LLM agent supply chains by manipulating compliance rules to force unauthorized code generation, achi…
The paper introduces a systematic framework and defense mechanisms to analyze and mitigate autonomous LLM agent worms that propagate through persistent agent state and cross-platform multi-agent syste…
Qian'ang Mao, Jiaxin Wang, Ya Liu, Li Zhu +2 more
The paper develops a unified, cross-layer security framework for autonomous LLM agents operating in agentic commerce, identifying key attack vectors and proposing a layered defense architecture.
This paper analyzes 470 security advisories in the OpenClaw AI agent framework, demonstrating that the system's structural weakness lies in per-layer trust enforcement, enabling cross-layer remote cod…
Zonghao Ying, Haozheng Wang, Jiangfan Liu, Quanchen Zou +4 more
AgentVisor is a novel defense framework that uses semantic virtualization, inspired by OS principles, to significantly reduce LLM agent vulnerability to prompt injection while maintaining high utility…
The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…
The paper introduces 'causality laundering,' a novel security vulnerability in tool-calling LLM agents where adversaries exfiltrate information by probing denied actions, and proposes the Agentic Refe…
Soham Roy, Sarthakbrata Halder, Arya Bharaty, Vaibhav Bhaskar +4 more
The paper demonstrates that autonomous web agents are highly susceptible to social-engineering attacks, leaking critical PII even when they internally flag a site as suspicious, necessitating output-l…
Soham Roy, Sarthakbrata Halder, Arya Bharaty, Vaibhav Bhaskar +4 more
The paper demonstrates that autonomous web agents are highly susceptible to social-engineering attacks, leaking critical PII even when they internally flag a site as suspicious, necessitating output-l…
This paper proposes the first web-focused threat model for agentic browsers, demonstrating that traditional web social engineering attacks can be amplified into dangerous, reproducible threats when ex…
The paper introduces ClawTrap, a MITM-based red-teaming framework, to evaluate the security robustness of web agents like OpenClaw against dynamic, real-world network attacks, finding that model stren…
Yiheng Huang, Zhijia Zhao, Bihuan Chen, Susheng Wu +4 more
This paper introduces a component-centric framework and a novel detector, Connor, to understand and detect sophisticated, multi-component attacks targeting the Model Context Protocol (MCP) servers.
Sina Abdollahi, Mohammad M Maheri, Javad Forough, Amir Al Sadi +4 more
AgenTEE is a system that enables the secure, confidential execution of complex LLM agent pipelines directly on edge devices by using isolated confidential virtual machines.
Ting Zhang, Yikun Li, Chengran Yang, Ratnadira Widyasari +14 more
TitanCA presents a novel, multi-agent LLM orchestration framework that significantly improves vulnerability discovery by reducing false positives and identifying numerous zero-day vulnerabilities.