~ similar to 2603.19011v1· 20 results
The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…
Mingxuan Zhang, Jiahui Han, Dadi Guo, Songze Li +4 more
The paper introduces PrivacyPeek, a new benchmark that audits the acquisition stage of LLM-based agents to demonstrate that unnecessary acquisition of sensitive data is a widespread and critical priva…
Mingxuan Zhang, Jiahui Han, Dadi Guo, Songze Li +4 more
The paper introduces PrivacyPeek, a new benchmark that audits the acquisition stage of LLM-based agents to show that unnecessary and sensitive data acquisition is a widespread and critical privacy vul…
The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…
This survey analyzes the unique security threats posed by complex, multi-agent AI systems and proposes Confidential Computing (CC) using Trusted Execution Environments (TEEs) as a hardware-rooted defe…
The paper proposes an organization-scoped LLM agent runtime architecture designed to provide an auditable, model-agnostic platform for regulated cybersecurity operations, integrating deeply with exist…
The paper proposes a novel, organization-scoped LLM agent runtime architecture designed specifically for regulated cybersecurity operations, ensuring auditable context and integration with existing se…
The paper proposes Agentic Witnessing, a TEE-enabled framework that allows external verifiers to audit the qualitative properties of private datasets by querying an LLM-based auditor without accessing…
This paper analyzes the security of LLM-based autonomous agents by drawing parallels to operating system security, finding that while some vulnerabilities are inherent, many can be mitigated using est…
The paper systematically maps LLM agent vulnerabilities by testing 10,000 prompt variations, finding that 'goal reframing' language is the primary trigger for exploitation, rather than broad adversari…
The paper proposes Sello, a novel protocol that allows an owner to reconstruct a tamper-evident and verifiable record of AI agent actions by having a trusted receiver sign and publish receipts of the…
The paper argues that LLM agent security is fundamentally an agent-human interaction (AHI) problem, demonstrating that industry practices rely on human-centric mechanisms while academic research focus…
The paper analyzes the failure modes of current AI containment methods when the agent itself is the adversary, deriving five necessary architectural requirements for durable safety.
Yiqi Wang, Jiaqi Zhang, Taotao Cai, Zirui Liu +5 more
This survey provides a systematic framework and taxonomy for evidence tracing and execution provenance in LLM agents, addressing the difficulty of verifying and auditing complex agent behaviors.
Minfeng Qi, Tianqing Zhu, Zijie Xu, Congcong Zhu +2 more
The paper introduces CAESAR, a novel multi-agent framework that coordinates LLM agents across five specialized roles to improve success rates and stability in complex, multi-stage cyber intrusion task…
The paper introduces ASPI, a benchmark showing that requiring LLM agents to seek clarification significantly amplifies their vulnerability to prompt injection attacks.
Qian'ang Mao, Jiaxin Wang, Ya Liu, Li Zhu +2 more
The paper develops a unified, cross-layer security framework for autonomous LLM agents operating in agentic commerce, identifying key attack vectors and proposing a layered defense architecture.
Sina Abdollahi, Mohammad M Maheri, Javad Forough, Amir Al Sadi +4 more
AgenTEE is a system that enables the secure, confidential execution of complex LLM agent pipelines directly on edge devices by using isolated confidential virtual machines.
The paper proposes a two-stage post-training pipeline to create a small, local LLM agent (PrivEsc-LLM) capable of performing Linux privilege escalation, achieving high success rates while drastically…
Guangsheng Yu, Qin Wang, Rui Lang, Shuai Su +1 more
PlanTwin introduces a privacy-preserving architecture that allows cloud-hosted LLMs to plan over sensitive local environments by projecting the raw state into a sanitized, abstract digital twin.