Papers similar to 2605.11039v1

~ similar to 2605.11039v1· 20 results

cs.CRRecentMay 26, 2026

Aligning Provenance with Authorization: A Dual-Graph Defense for LLM Agents

The paper proposes AuthGraph, a dual-graph defense framework that structurally compares information provenance (what data was used) against a clean authorization baseline to detect fine-grained, param…

View →

cs.CRcs.AIRecentJun 3, 2026

From Agent Traces to Trust: Evidence Tracing and Execution Provenance in LLM Agents

Yiqi Wang, Jiaqi Zhang, Taotao Cai, Zirui Liu +5 more

This survey provides a systematic framework and taxonomy for evidence tracing and execution provenance in LLM agents, addressing the difficulty of verifying and auditing complex agent behaviors.

View →

cs.CRcs.AIcs.MARecentMay 1, 2026

Skills as Verifiable Artifacts: A Trust Schema and a Biconditional Correctness Criterion for Human-in-the-Loop Agent Runtimes

Alfredo Metere

The paper proposes a trust schema and verification framework to ensure that agent skills, which augment LLMs, are rigorously verified before deployment, thereby making human-in-the-loop oversight scal…

View →

cs.CRRecentApr 27, 2026

AgentVisor: Defending LLM Agents Against Prompt Injection via Semantic Virtualization

Zonghao Ying, Haozheng Wang, Jiangfan Liu, Quanchen Zou +4 more

AgentVisor is a novel defense framework that uses semantic virtualization, inspired by OS principles, to significantly reduce LLM agent vulnerability to prompt injection while maintaining high utility…

View →

cs.CRRecentMay 9, 2026

When LLMs Team Up: A Coordinated Attack Framework for Automated Cyber Intrusions

Minfeng Qi, Tianqing Zhu, Zijie Xu, Congcong Zhu +2 more

The paper introduces CAESAR, a novel multi-agent framework that coordinates LLM agents across five specialized roles to improve success rates and stability in complex, multi-stage cyber intrusion task…

View →

cs.CRRecentMay 25, 2026

AgentSecBench: Measuring Prompt Injection, Privacy Leakage, and Tool-Use Integrity in LLM Agents

Faruk Alpay, Taylan Alpay

The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…

View →

cs.AIcs.CRRecentMay 6, 2026

AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use

Chenglin Yang

AgentTrust is a novel runtime safety layer that intercepts and evaluates AI agent tool calls before execution, achieving high accuracy in detecting unsafe actions across complex and obfuscated scenari…

View →

cs.CRcs.AIRecentMar 24, 2026

Agent-Sentry: Bounding LLM Agents via Execution Provenance

Rohan Sequeira, Stavros Damianakis, Umar Iqbal, Konstantinos Psounis

Agent-Sentry is a runtime defense system that bounds the execution of LLM agents by learning a profile of benign behavior, effectively blocking malicious injections while maintaining high compatibilit…

View →

cs.CRcs.SERecentMay 5, 2026

ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection

Shihao Weng, Yang Feng, Jinrui Zhang, Xiaofei Xie +2 more

The paper introduces ARGUS, a defense mechanism that uses provenance-aware decision auditing to protect LLM agents from sophisticated, context-aware prompt injection attacks, significantly reducing th…

View →

cs.CRcs.AIRecentMar 29, 2026

A Security Analysis of the OpenClaw AI Agent Framework

Surada Suwansathit, Yuxuan Zhang, Guofei Gu

This paper analyzes 470 security advisories in the OpenClaw AI agent framework, demonstrating that the system's structural weakness lies in per-layer trust enforcement, enabling cross-layer remote cod…

View →

cs.CRcs.AIRecentMay 7, 2026

From Specification to Deployment: Empirical Evidence from a W3C VC + DID Trust Infrastructure for Autonomous Agents

Lars Kersten Kroehl

The paper introduces MolTrust, a production-deployed trust infrastructure built on W3C standards (VCs and DIDs) that provides a verifiable, multi-layered authorization framework for autonomous AI agen…

View →

cs.CRcs.AIRecentApr 5, 2026

Causality Laundering: Denial-Feedback Leakage in Tool-Calling LLM Agents

Mohammad Hossein Chinaei

The paper introduces 'causality laundering,' a novel security vulnerability in tool-calling LLM agents where adversaries exfiltrate information by probing denied actions, and proposes the Agentic Refe…

View →

cs.CRRecentApr 25, 2026

When the Agent Is the Adversary: Architectural Requirements for Agentic AI Containment After the April 2026 Frontier Model Escape

Richard Joseph Mitchell

The paper analyzes the failure modes of current AI containment methods when the agent itself is the adversary, deriving five necessary architectural requirements for durable safety.

View →

cs.CRcs.AIRecentMay 14, 2026

MemLineage: Lineage-Guided Enforcement for LLM Agent Memory

Ciyan Ouyang, Rui Hou

MemLineage introduces a novel, cryptographically-backed defense mechanism that enforces a chain-of-custody for LLM agent memory, preventing untrusted or poisoned state from justifying sensitive action…

View →

cs.CRcs.AIRecentMay 26, 2026

ChainCaps: Composition-Safe Tool-Using Agents via Monotonic Capability Attenuation

Xiaochong Jiang, Shiqi Yang, Ziwei Li, Lifei Liu +2 more

ChainCaps introduces a novel runtime capability budgeting system that prevents 'permission laundering' in complex tool-using agents, significantly reducing attack success rates while maintaining benig…

View →

cs.CRcs.AIRecentMay 7, 2026

PragLocker: Protecting Agent Intellectual Property in Untrusted Deployments via Non-Portable Prompts

Qinfeng Li, Yuntai Bao, Jianghui Hu, Wenqi Zhang +4 more

PragLocker is a novel prompt protection scheme that secures valuable LLM agent prompts against theft and reuse by other proprietary models by making them non-portable.

View →

cs.CRRecentMay 27, 2026

AgentGuard: An Attribute-Based Access Control Framework for Tool-Use LLM-Based Agent

Jiaqi Luo, Songyang Peng, Jiarun Dai, Zhile Chen +5 more

AgentGuard is an attribute-based access control framework designed to mitigate severe security risks, such as privacy leakage and system compromise, in tool-using LLM-based agents.

View →

cs.CRcs.AIRecentMay 26, 2026

Grimlock: Guarding High-Agency Systems with eBPF and Attested Channels

Qiancheng Wu, Wenhui Zhang, Gan Fang, Sheng Mao +4 more

Grimlock is an Agent Guard that enhances security for high-agency systems by enforcing identity, authorization, and scope-bound communication through eBPF and attested TLS channels, without modifying…

View →

cs.CRRecentMar 30, 2026

Attesting LLM Pipelines: Enforcing Verifiable Training and Release Claims

Zhuoran Tan, Jeremy Singer, Christos Anagnostopoulos

The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…

View →

cs.CRcs.AIcs.SERecentMay 5, 2026

Cryptographic Registry Provenance: Structural Defense Against Dependency Confusion in AI Package Ecosystems

Alan L. McCann

The paper proposes a comprehensive cryptographic distribution provenance system to structurally defend against dependency confusion attacks in software package ecosystems.

View →