~ similar to 2603.21231v1· 20 results
Zelin Zhang, Qi Li, Jie Cao, Lingshuang Liu +1 more
The paper analyzes the escalating security and safety threats posed by generative AI systems as they transition from merely generating content to executing real-world actions via tools and agents, fin…
Ying Li, Hongbo Wen, Yanju Chen, Hanzhi Liu +2 more
The paper introduces Sefz, a semantic fuzzing framework that automatically discovers specification violations in LLM agent skills, finding a significant number of previously unknown exploitable guardr…
This paper systematically analyzes security risks in cloud-hosted, tool-enabled AI agents, concluding that most risks stem from over-privileged tools and capability-intent mismatches rather than novel…
Chong Xiang, Drew Zagieboylo, Shaona Ghosh, Sanjay Kariyappa +4 more
The paper proposes a vision for system-level defenses against indirect prompt injection attacks targeting AI agents, emphasizing structured control and human oversight.
Xuwei Ding, Skylar Zhai, Linxin Song, Jiate Li +5 more
The paper introduces OS-BLIND, a benchmark demonstrating that current safety evaluations fail to detect critical vulnerabilities in computer-use agents when user instructions are benign, showing high…
The paper proposes a Semantic Gateway and a Zero-Trust security model to formally validate and secure autonomous AI agents operating in enterprise systems, achieving a 100% discovery rate of unauthori…
Shiping Chen, Qin Wang, Guangsheng Yu, Xu Wang +1 more
This paper systematizes the security challenges of open agentic systems, concluding that while attack characterization is mature, the field lacks robust guidelines for operational governance, memory i…
This paper systematically maps the expanded attack surface of agentic AI systems, identifying new threat vectors like RAG poisoning and cross-agent manipulation, and proposes a comprehensive security…
Kevin Eykholt, Dhilung Kirat, Xiaokui Shu, Jiyong Jang +2 more
The paper reports on penetration tests conducted on proprietary, large-scale AI agent systems, finding that security vulnerabilities persist despite stricter development standards.
The paper introduces alignment contracts, a formal framework for specifying and enforcing behavioral constraints over observable effect traces, ensuring that powerful agentic security systems operate…
The paper proposes the Policy-Execution-Authorization (PEA) architecture, a separation-of-powers system designed to structurally enforce goal integrity in AI agents, moving safety from a probabilistic…
Huiyu Xu, Zhibo Wang, Wenhui Zhang, Ziqi Zhu +3 more
The paper introduces LoopTrap, an automated red-teaming framework that demonstrates how malicious prompts can poison the termination judgment of LLM agents, causing unbounded computation.
Zhichao Liu, Wenbo Pan, Haining Yu, Ge Gao +2 more
WebTrap introduces a stealthy, mid-task hijacking attack that successfully compromises browser agents during long-horizon tasks by seamlessly fusing malicious instructions with the original user goal.
Tanzim Ahad, Ismail Hossain, Md Jahangir Alam, Sai Puppala +3 more
The paper introduces Semantic Intent Fragmentation (SIF), an attack class demonstrating that multi-agent AI orchestrators can violate security policies through a composition of individually benign sub…
Vincent Siu, Jingxuan He, Kyle Montgomery, Zhun Wang +3 more
The paper introduces a contextual security framework for LLM agents, defining security properties and reformulating various attacks and defenses based on the context of execution.
The paper introduces Owner-Harm, a formal threat model addressing the critical blind spot of AI agents harming their own deployers, demonstrating that specialized defenses are needed beyond generic sa…
Mihai Christodorescu, Earlence Fernandes, Ashish Hooda, Somesh Jha +10 more
The paper argues that agent security must be treated as a systems problem, requiring the enforcement of security invariants at the system level rather than solely relying on improving the underlying A…
The paper analyzes how runtime safety enforcement impacts the performance of multi-step LLM agents, finding that while safety mechanisms can block unsafe actions, they impose a significant performance…
Fanxiao Li, Jiaying Wu, Tingchao Fu, Natasha Jaques +2 more
The paper introduces FlowSteer, a prompt-only attack that exploits vulnerabilities in how multi-agent LLM systems plan workflows, significantly increasing the success rate of malicious signal propagat…
The paper argues that LLM agent security is fundamentally an agent-human interaction (AHI) problem, demonstrating that industry practices rely on human-centric mechanisms while academic research focus…