~ similar to 2605.29801v1· 20 results
Dongrui Liu, Yu Li, Zhonghao Yang, Peng Wang +46 more
The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.
AgentTrust is a novel runtime safety layer that intercepts and evaluates AI agent tool calls before execution, achieving high accuracy in detecting unsafe actions across complex and obfuscated scenari…
AgentWall is a runtime safety layer that intercepts and evaluates all proposed actions from local AI agents against a declarative policy, ensuring safety before execution.
Haoyu Wang, Zibo Xiao, Yedi Zhang, Christopher M. Poskitt +1 more
The paper proposes SafeClaw-R, a novel framework that enforces safety as a system-level invariant over the execution graph to mitigate the high safety and security risks inherent in autonomous multi-a…
Yuhang Wang, Haichang Gao, Zhenxing Niu, Zhaoxiang Liu +3 more
The paper systematically evaluates six OpenClaw-series AI agent frameworks, demonstrating that these agentized systems possess significant security vulnerabilities that are distinct from and more seve…
The paper introduces BOA, a novel framework that measures agent safety by exhaustively searching the entire in-budget trajectory space, thereby identifying unsafe behaviors missed by traditional sampl…
Yining Hong, Yining She, Eunsuk Kang, Christopher S. Timperley +1 more
The paper proposes and evaluates symbolic guardrails as a practical method to provide strong, verifiable safety and security guarantees for domain-specific AI agents without compromising their utility…
Zijun Wang, Haoqin Tu, Letian Zhang, Hardy Chen +10 more
This paper conducts the first real-world safety evaluation of the personal AI agent OpenClaw, demonstrating that its broad system access creates inherent vulnerabilities that significantly increase th…
Chang Jin, An Wang, Zeming Wei, Kai Wang +6 more
The paper introduces SkillSafetyBench, a comprehensive benchmark demonstrating that agent safety failures often stem from adversarial influences within reusable skills and execution environments, rath…
This paper analyzes 470 security advisories in the OpenClaw AI agent framework, demonstrating that the system's structural weakness lies in per-layer trust enforcement, enabling cross-layer remote cod…
The paper introduces Owner-Harm, a formal threat model addressing the critical blind spot of AI agents harming their own deployers, demonstrating that specialized defenses are needed beyond generic sa…
Jinhu Qi, Muzhi Li, Jiahong Liu, Yuqin Shu +8 more
This survey provides a comprehensive, practical guide to ensuring the trustworthiness of complex, autonomous agentic AI systems by focusing on safety, robustness, privacy, and system security.
Dongwook Choi, Taeyoon Kwon, Bogyung Jeong, Minju Kim +5 more
EMBGuard introduces a novel, MLLM-based safety guardrail that explicitly identifies and explains physical hazards from (visual observation, action) pairs, enabling safer planning for embodied agents.
This paper provides a systematic, layered review of security risks and defense strategies for autonomous agent frameworks, using OpenClaw as a case study to address the current lack of integrated rese…
This paper analyzes the security of LLM-based autonomous agents by drawing parallels to operating system security, finding that while some vulnerabilities are inherent, many can be mitigated using est…
Xian Qi Loye, Qinglin Su, Zhexin Zhang, Shiyao Cui +4 more
The paper introduces RUBAS, a rubric-based reinforcement learning framework that improves agent safety by providing fine-grained, multi-dimensional rewards for complex tool-use scenarios.
Songyang Liu, Chaozhuo Li, Chenxu Wang, Jinyu Hou +7 more
ClawKeeper is a comprehensive, multi-layered security framework designed to mitigate critical vulnerabilities in autonomous agent runtimes like OpenClaw by enforcing protection across skills, plugins,…
The paper introduces Distributed Sentinel, a zero-trust architecture that prevents Context-Fragmented Violations (CFVs) in multi-agent systems by propagating security state across departmental boundar…
Zhenhao Xu, Wenhan Chang, Yichuan Chen, Yuxin Fang +2 more
The paper proposes Safety Context Injection (SCI), an inference-time framework that prepends a structured external risk report to protect Large Reasoning Models (LRMs) against sophisticated jailbreaks…
The paper benchmarks current frontier computer-using agents against hand-crafted attacks, finding that while they are highly safe in browser tasks, this safety does not generalize to other domains lik…