~ similar to 2606.06423· 20 results
Xinyi Ning, Zilin Bian, Dachuan Zuo, Semiha Ergan +1 more
The paper proposes a Risk Horizon Profiling (RHP) module that uses a continuous potential field model to profile future risk distributions, significantly improving trajectory prediction accuracy in bo…
Qingwen Pu, Kun Xie, Hong Yang, Di Yang +1 more
The paper develops a novel deep reinforcement learning framework, SMamba-DDPG, to accurately model vehicle-type-specific pedestrian crash avoidance behavior, finding that pedestrians react faster and…
SARAD proposes a novel safety-aware hybrid framework that combines Large Language Models (LLMs) and Deep Reinforcement Learning (DRL) to improve autonomous driving decision-making by replacing random…
The paper introduces a stealthy, scenario-realistic data fabrication attack that subtly manipulates object poses in shared perception data to induce unsafe driving behaviors in connected and autonomou…
This paper surveys the risks associated with world models, proposing a unified threat model and demonstrating adversarial attacks that show world models require rigorous safety standards comparable to…
This paper demonstrates that reasoning-enabled Vision-Language-Action (VLA) models for autonomous driving are highly vulnerable to realistic input perturbations, significantly compromising both reason…
The paper proposes SafeDIG, a robust safety steering framework that adapts Diffusion Transformers for text-to-image generation by treating safety control as position-aware sparse feature transfer, ens…
The paper introduces an uncertainty-aware framework that uses regulated expert advice to guide safe and efficient exploration for autonomous driving policies, significantly improving performance in co…
Dongrui Liu, Yu Li, Zhonghao Yang, Peng Wang +46 more
The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex open-world agent deployments.
Dongrui Liu, Yu Li, Zhonghao Yang, Peng Wang +46 more
The paper introduces AgentDoG 1.5, a lightweight and scalable alignment framework that significantly improves AI agent safety and security for complex, open-world agentic scenarios.
Dongwook Choi, Taeyoon Kwon, Bogyung Jeong, Minju Kim +5 more
EMBGuard introduces a novel, MLLM-based safety guardrail that explicitly identifies and explains physical hazards from (visual observation, action) pairs, enabling safer planning for embodied agents.
This paper proposes a systematic joint workflow combining HARA and TARA to comprehensively identify and analyze risks stemming from inherent limitations of Deep Neural Networks (DNNs) used in autonomo…
Yusong Zhao, Yuejin Xie, Youliang Yuan, Junjie Hu +3 more
The paper introduces PaSBench-Video, a comprehensive streaming video benchmark designed to rigorously test multimodal LLMs' ability to issue proactive safety warnings, finding that current models stru…
Zhepei Hong, Lin Wang, Liting Li, Haokai Ma +4 more
The paper proposes TRACE, a trajectory risk-aware compression method, to effectively aggregate sparse and delayed safety evidence across long agent trajectories, achieving state-of-the-art performance…
Zhenhao Xu, Wenhan Chang, Yichuan Chen, Yuxin Fang +2 more
The paper proposes Safety Context Injection (SCI), an inference-time framework that prepends a structured external risk report to protect Large Reasoning Models (LRMs) against sophisticated jailbreaks…
Qi Hu, Yifeng Tang, Qinghua Wang, Lanyang Zhao +6 more
The paper introduces SABER, a new benchmark that evaluates the operational safety of LLM coding agents in complex, stateful project environments, finding that current models have a high rate of harmfu…
Xian Qi Loye, Qinglin Su, Zhexin Zhang, Shiyao Cui +4 more
The paper introduces RUBAS, a rubric-based reinforcement learning framework that improves agent safety by providing fine-grained, multi-dimensional rewards for complex tool-use scenarios.
Fanxiao Li, Jiaying Wu, Tingchao Fu, Natasha Jaques +2 more
The paper introduces FlowSteer, a prompt-only attack that exploits vulnerabilities in how multi-agent LLM systems plan workflows, significantly increasing the success rate of malicious signal propagat…
The paper demonstrates that fine-tuning safety guard models on benign data can catastrophically collapse their safety alignment, proposing Fisher-Weighted Safety Subspace Regularization (FW-SSR) to ac…
The paper introduces BOA, a novel framework that measures agent safety by exhaustively searching the entire in-budget trajectory space, thereby identifying unsafe behaviors missed by traditional sampl…