~ similar to 2604.17517v2· 20 results
The paper introduces Agent Control Protocol (ACP), a stateful temporal admission control mechanism that enforces behavioral properties over execution traces to prevent harmful patterns from individual…
The paper proposes the Policy-Execution-Authorization (PEA) architecture, a separation-of-powers system designed to structurally enforce goal integrity in AI agents, moving safety from a probabilistic…
The paper introduces the concept of policy-invisible violations in LLM agents and proposes Sentinel, a counterfactual graph simulation framework, which significantly improves policy enforcement accura…
Baoyuan Wu, Qingshan Liu, Adel Bibi, Irwin King +1 more
The paper argues that the Authorization-Execution Gap (AEG)—the divergence between intended authorization and actual execution—is a critical safety and security flaw in open-world agents, requiring so…
The paper proposes Proof-Carrying Agent Actions (PCAA), a runtime-neutral governance model that uses action certificates to consistently track and authorize high-risk actions across diverse and hetero…
The paper introduces Distributed Sentinel, a zero-trust architecture that prevents Context-Fragmented Violations (CFVs) in multi-agent systems by propagating security state across departmental boundar…
The paper introduces the Open Agent Passport (OAP), a deterministic pre-action authorization framework that intercepts and validates AI agent tool calls against a declarative policy, achieving a 0% su…
Agent-Sentry is a runtime defense system that bounds the execution of LLM agents by learning a profile of benign behavior, effectively blocking malicious injections while maintaining high compatibilit…
The paper proposes the Redpanda Agentic Data Plane (ADP), an architecture that uses out-of-band metadata channels to deterministically enforce security policies and governance for autonomous AI agents…
The paper introduces alignment contracts, a formal framework for specifying and enforcing behavioral constraints over observable effect traces, ensuring that powerful agentic security systems operate…
The paper identifies and measures a critical failure mode where LLM agents violate policies by losing or corrupting directive-bearing state during the process of assembling the decision context, and p…
Xianyou Li, Weiran Yan, Yichao Wu, Penghao Liang +3 more
This paper introduces a failure-aware observability framework to diagnose wasted computation in multi-agent LLM systems by mapping recurring failure modes to online trace signals.
The paper proposes a compositional governance framework to provide richer, dynamic authorization semantics necessary for governing autonomous agentic AI systems, moving beyond traditional static IAM m…
The paper demonstrates that many instruction-tuned language models suffer from 'silent commitment failure,' meaning they can produce confidently incorrect outputs without any warning signal, and intro…
Mihai Christodorescu, Earlence Fernandes, Ashish Hooda, Somesh Jha +10 more
The paper argues that agent security must be treated as a systems problem, requiring the enforcement of security invariants at the system level rather than solely relying on improving the underlying A…
SentinelAgent introduces a formal framework, the Intent-Preserving Delegation Protocol (IPDP), to secure federal multi-agent AI systems by verifying complex delegation chains against seven properties,…
The paper introduces SafetyDrift, a predictive model that forecasts when AI agents will violate safety protocols by analyzing the cumulative risk across sequences of individually safe actions.
The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…
The paper introduces containment verification, a novel method that provides safety guarantees by formally verifying the agentic framework itself, ensuring safety regardless of the underlying AI model'…
The paper introduces the concept of the atomic decision boundary, proving that for autonomous systems to guarantee execution-time admissibility, the decision and the resulting state transition must oc…