~ similar to 2605.16035v1· 20 results
The paper introduces Owner-Harm, a formal threat model addressing the critical blind spot of AI agents harming their own deployers, demonstrating that specialized defenses are needed beyond generic sa…
Mihai Christodorescu, Earlence Fernandes, Ashish Hooda, Somesh Jha +10 more
The paper argues that agent security must be treated as a systems problem, requiring the enforcement of security invariants at the system level rather than solely relying on improving the underlying A…
The paper proposes the Redpanda Agentic Data Plane (ADP), an architecture that uses out-of-band metadata channels to deterministically enforce security policies and governance for autonomous AI agents…
PocketAgents introduces a manifest-driven framework for autonomous defense agents, enabling measurable and attributable LLM-driven security responses by strictly controlling agent actions and telemetr…
The paper proposes a novel, empirical methodology called 'backchaining' to derive and prioritize Loss of Control (LoC) mitigations by analyzing the errors an AI system makes on mission-specific nation…
The paper proposes Sello, a novel protocol that allows an owner to reconstruct a tamper-evident and verifiable record of AI agent actions by having a trusted receiver sign and publish receipts of the…
Minghui Xu, Xiaoyu Liu, Yihao Guo, Chunchi Liu +2 more
The paper proposes AgentDID, a decentralized framework using DIDs and verifiable credentials to provide trustless identity authentication and dynamic state verification for autonomous, self-managed AI…
Kevin Eykholt, Dhilung Kirat, Xiaokui Shu, Jiyong Jang +2 more
The paper reports on penetration tests conducted on proprietary, large-scale AI agent systems, finding that security vulnerabilities persist despite stricter development standards.
Qian'ang Mao, Jiaxin Wang, Ya Liu, Li Zhu +2 more
The paper develops a unified, cross-layer security framework for autonomous LLM agents operating in agentic commerce, identifying key attack vectors and proposing a layered defense architecture.
The paper introduces an AI red teaming agent that drastically reduces the time and effort required for security testing by allowing operators to define complex attack goals using natural language, com…
The paper proposes and tests a novel, non-security 'Recuse Signal'—an in-band signal—to allow operators to tell autonomous LLM agents to voluntarily withdraw access, demonstrating that compliant agent…
Hanzhi Liu, Chaofan Shou, Hongbo Wen, Yanju Chen +2 more
This paper systematically analyzes the threat posed by malicious third-party API routers in the LLM supply chain, finding that a significant number of routers actively perform payload injection, crede…
Zihan Wang, Rui Zhang, Yu Liu, Chi Liu +3 more
This paper presents the first systematic study of black-box skill stealing attacks against proprietary LLM agents, demonstrating that structured agent skills can be easily extracted, posing a signific…
The paper introduces MolTrust, a production-deployed trust infrastructure built on W3C standards (VCs and DIDs) that provides a verifiable, multi-layered authorization framework for autonomous AI agen…
Baoyuan Wu, Qingshan Liu, Adel Bibi, Irwin King +1 more
The paper argues that the Authorization-Execution Gap (AEG)—the divergence between intended authorization and actual execution—is a critical safety and security flaw in open-world agents, requiring so…
The paper argues that traditional identity-based reputation mechanisms are structurally inapplicable to language model agents because their mutable, modular nature makes them ontologically dissociativ…
Zelin Zhang, Qi Li, Jie Cao, Lingshuang Liu +1 more
The paper analyzes the escalating security and safety threats posed by generative AI systems as they transition from merely generating content to executing real-world actions via tools and agents, fin…
The paper defines AI Identity as the correspondence between an agent's declared state and its observed behavior, concluding that current infrastructure and standards are fundamentally inadequate for g…
The paper introduces NeuroTaint, a novel taint tracking framework that adapts information flow analysis for LLM agents by modeling taint propagation as semantic transformation and causal influence, si…
Hanzhi Liu, Chaofan Shou, Xiaonan Liu, Hongbo Wen +3 more
The paper introduces AgentFlow, a novel framework that uses a typed graph DSL and feedback-driven optimization to automatically synthesize and improve multi-agent harnesses for discovering security vu…