Papers similar to 2604.02767v1

~ similar to 2604.02767v1· 20 results

cs.MAcs.AIcs.CRRecentApr 24, 2026

Beyond Single-Agent Alignment: Preventing Context-Fragmented Violations in Multi-Agent Systems

The paper introduces Distributed Sentinel, a zero-trust architecture that prevents Context-Fragmented Violations (CFVs) in multi-agent systems by propagating security state across departmental boundar…

View →

cs.AIcs.CRRecentApr 26, 2026

Structural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture

Rong Xiang

The paper proposes the Policy-Execution-Authorization (PEA) architecture, a separation-of-powers system designed to structurally enforce goal integrity in AI agents, moving safety from a probabilistic…

View →

cs.CRcs.AIRecentMay 7, 2026

From Specification to Deployment: Empirical Evidence from a W3C VC + DID Trust Infrastructure for Autonomous Agents

Lars Kersten Kroehl

The paper introduces MolTrust, a production-deployed trust infrastructure built on W3C standards (VCs and DIDs) that provides a verifiable, multi-layered authorization framework for autonomous AI agen…

View →

cs.SEcs.AIcs.CRRecentJun 2, 2026

Proof-Carrying Agent Actions: Model-Agnostic Runtime Governance for Heterogeneous Agent Systems

Zexun Wang

The paper proposes Proof-Carrying Agent Actions (PCAA), a runtime-neutral governance model that uses action certificates to consistently track and authorize high-risk actions across diverse and hetero…

View →

cs.CRRecentMar 25, 2026

AgentRFC: Security Design Principles and Conformance Testing for Agent Protocols

Shenghan Zheng, Qifan Zhang

The paper introduces a comprehensive security framework, AgentRFC, to systematically analyze and test the security conformance of various AI agent protocols, identifying critical design gaps, especial…

View →

cs.CRcs.AIcs.LGRecentApr 8, 2026

Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines

Tanzim Ahad, Ismail Hossain, Md Jahangir Alam, Sai Puppala +3 more

The paper introduces Semantic Intent Fragmentation (SIF), an attack class demonstrating that multi-agent AI orchestrators can violate security policies through a composition of individually benign sub…

View →

cs.AIcs.CRRecentJun 2, 2026

Overlaying Governance: A Compositional Authorization Framework for Delegation and Scope in Agentic AI

Amjad Ibrahim, Yong Li

The paper proposes a compositional governance framework to provide richer, dynamic authorization semantics necessary for governing autonomous agentic AI systems, moving beyond traditional static IAM m…

View →

cs.CRcs.AIRecentMar 24, 2026

Agent-Sentry: Bounding LLM Agents via Execution Provenance

Rohan Sequeira, Stavros Damianakis, Umar Iqbal, Konstantinos Psounis

Agent-Sentry is a runtime defense system that bounds the execution of LLM agents by learning a profile of benign behavior, effectively blocking malicious injections while maintaining high compatibilit…

View →

cs.CRcs.AIRecentMay 4, 2026

When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI

Javad Forough, Marios Kogias, Hamed Haddadi

This survey analyzes the unique security threats posed by complex, multi-agent AI systems and proposes Confidential Computing (CC) using Trusted Execution Environments (TEEs) as a hardware-rooted defe…

View →

cs.CRcs.AIRecentMay 26, 2026

Grimlock: Guarding High-Agency Systems with eBPF and Attested Channels

Qiancheng Wu, Wenhui Zhang, Gan Fang, Sheng Mao +4 more

Grimlock is an Agent Guard that enhances security for high-agency systems by enforcing identity, authorization, and scope-bound communication through eBPF and attested TLS channels, without modifying…

View →

cs.AIcs.CLcs.CRRecentApr 14, 2026

Policy-Invisible Violations in LLM-Based Agents

Jie Wu, Ming Gong

The paper introduces the concept of policy-invisible violations in LLM agents and proposes Sentinel, a counterfactual graph simulation framework, which significantly improves policy enforcement accura…

View →

cs.CRcs.AIRecentMay 10, 2026

The Authorization-Execution Gap Is a Major Safety and Security Problem in Open-World Agents

Baoyuan Wu, Qingshan Liu, Adel Bibi, Irwin King +1 more

The paper argues that the Authorization-Execution Gap (AEG)—the divergence between intended authorization and actual execution—is a critical safety and security flaw in open-world agents, requiring so…

View →

cs.CRcs.AIRecentMar 21, 2026

Before the Tool Call: Deterministic Pre-Action Authorization for Autonomous AI Agents

Uchi Uchibeke

The paper introduces the Open Agent Passport (OAP), a deterministic pre-action authorization framework that intercepts and validates AI agent tool calls against a declarative policy, achieving a 0% su…

View →

cs.CRcs.AIcs.SERecentMay 21, 2026

Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions

Jianan Ma, Xiaohu Du, Ruixiao Lin, Yaoxiang Bian +7 more

The paper introduces a multi-dimensional evasion framework and a new benchmark (A3S-Bench) to test autonomous agents, demonstrating that stateful, multi-turn attacks significantly increase system risk…

View →

cs.AIcs.CRRecentMay 5, 2026

Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours

Raja Sekhar Rao Dheekonda, Will Pearce, Nick Landers

The paper introduces an AI red teaming agent that drastically reduces the time and effort required for security testing by allowing operators to define complex attack goals using natural language, com…

View →

cs.AIcs.CRRecentMar 22, 2026

Session Risk Memory (SRM): Temporal Authorization for Deterministic Pre-Execution Safety Gates

Florin Adrian Chitan

The paper introduces Session Risk Memory (SRM), a lightweight module that enhances per-action authorization gates with trajectory-level risk assessment, significantly improving detection of distribute…

View →

cs.CRcs.AIcs.CLRecentMay 4, 2026

MAGE: Safeguarding LLM Agents against Long-Horizon Threats via Shadow Memory

Yuhui Wang, Tanqiu Jiang, Jiacheng Liang, Charles Fleming +1 more

The paper introduces MAGE, a novel defensive framework that uses a dedicated 'shadow memory' to proactively detect and mitigate long-horizon threats against LLM agents during complex, multi-step inter…

View →

cs.CRRecentMay 25, 2026

AgentSecBench: Measuring Prompt Injection, Privacy Leakage, and Tool-Use Integrity in LLM Agents

Faruk Alpay, Taylan Alpay

The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…

View →

cs.AIcs.CRcs.SERecentMay 9, 2026

Containment Verification: AI Safety Guarantees Independent of Alignment

Royce Moon, Lav R. Varshney

The paper introduces containment verification, a novel method that provides safety guarantees by formally verifying the agentic framework itself, ensuring safety regardless of the underlying AI model'…

View →

cs.CRcs.AIRecentMay 18, 2026

Agent Security is a Systems Problem

Mihai Christodorescu, Earlence Fernandes, Ashish Hooda, Somesh Jha +10 more

The paper argues that agent security must be treated as a systems problem, requiring the enforcement of security invariants at the system level rather than solely relying on improving the underlying A…

View →