Papers similar to 2605.29114v1

Similar to 2605.29114v1 · 19 results

cs.CR · cs.CV · Recent · May 12, 2026

Still Camouflage, Moving Illusion: View-Induced Trajectory Manipulation in Autonomous Driving

Shuo Ju, Qingzhao Zhang, Huashan Chen, Xuheng Wang +5 more

The paper introduces a novel adversarial attack that uses static, view-dependent camouflage on a vehicle to induce consistent feature drift, causing autonomous systems to predict false, yet plausible,…

cs.CR · cs.AI · cs.RO · Recent · Mar 24, 2026

TRAP: Hijacking VLA CoT-Reasoning via Adversarial Patches

Zhengxian Huang, Wenjun Zhu, Haoxuan Qiu, Xiaoyu Ji +1 more

This paper introduces TRAP, an adversarial attack that demonstrates how physical patches can hijack the Chain-of-Thought (CoT) reasoning process in Vision-Language-Action (VLA) models, forcing them to…

cs.CR · Recent · May 12, 2026

Safety Context Injection: Inference-Time Safety Alignment via Static Filtering and Agentic Analysis

Zhenhao Xu, Wenhan Chang, Yichuan Chen, Yuxin Fang +2 more

The paper proposes Safety Context Injection (SCI), an inference-time framework that prepends a structured external risk report to protect Large Reasoning Models (LRMs) against sophisticated jailbreaks…

cs.CR · Recent · May 2, 2026

From Stealthy Data Fabrication to Unsafe Driving: Realistic Scenario Attacks on Collaborative Perception

Qingzhao Zhang, Runting Zhang, Z. Morley Mao

The paper introduces a stealthy, scenario-realistic data fabrication attack that subtly manipulates object poses in shared perception data to induce unsafe driving behaviors in connected and autonomou…

cs.CV · cs.CR · cs.LG · Recent · Apr 30, 2026

Understanding Adversarial Transferability in Vision-Language Models for Autonomous Driving: A Cross-Architecture Analysis

David Fernandez, Pedram MohajerAnsari, Amir Salarpour, Mert D. Pese

This paper systematically analyzes the high cross-architecture transferability of physical adversarial attacks on Vision-Language Models (VLMs) used in autonomous driving, demonstrating that attacks e…

cs.AI · cs.CR · Recent · May 28, 2026

Provably Secure Agent Guardrail

Benlong Wu, Weiming Zhang, Kejiang Chen, Han Fang +1 more

The paper introduces an executable Proof-Constrained Action (ePCA) framework that secures AI agents by forcing them to formalize their intentions into first-order logical constraints, achieving provab…

cs.AI · cs.CR · Recent · May 28, 2026

Provably Secure Agent Guardrail

Benlong Wu, Weiming Zhang, Kejiang Chen, Han Fang +1 more

The paper introduces a formal, logically constrained framework, ePCA, to secure advanced AI agents by forcing them to translate natural language intentions into first-order logical constraints before…

cs.AI · cs.CR · Recent · Mar 26, 2026

Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models

Xunguang Wang, Yuguang Zhou, Qingyue Wang, Zongjie Li +4 more

This paper introduces a novel framework, the Reasoning Safety Monitor, to detect and prevent logical inconsistencies and adversarial manipulations within the internal reasoning steps of large language…

cs.CR · Recent · May 8, 2026

Membership Inference Attacks on Vision-Language-Action Models

Yuefeng Peng, Mingzhe Li, Kejing Xia, Renhao Zhang +1 more

This paper presents the first systematic study of membership inference attacks (MIAs) against Vision-Language-Action (VLA) models, demonstrating that these models are highly vulnerable to privacy brea…

cs.CR · cs.AI · cs.LG · Recent · Apr 1, 2026

Safety, Security, and Cognitive Risks in World Models

Manoj Parmar

This paper surveys the risks associated with world models, proposing a unified threat model and demonstrating adversarial attacks that show world models require rigorous safety standards comparable to…

cs.CL · Recent · May 29, 2026

ConsisGuard: Aligning Safety Deliberation with Policy Enforcement in LLM Guardrails

Yan Wang, Zhixuan Chu, Zihao Xue, Zhen Bi +8 more

The paper introduces ConsisGuard, a framework that addresses the 'deliberation-to-enforcement gap' in LLM guardrails by ensuring that the reasoning process is faithfully and consistently translated in…

cs.AI · cs.CL · cs.CR · Recent · May 27, 2026

Robust and Efficient Guardrails with Latent Reasoning

Siddharth Sai, Xiaofei Wen, Muhao Chen

The paper introduces COLAGUARD, a novel guardrail model that efficiently transfers multi-step safety reasoning into a continuous latent space, achieving state-of-the-art safety performance with massiv…

cs.AI · cs.CL · cs.CR · Recent · May 27, 2026

Robust and Efficient Guardrails with Latent Reasoning

Siddharth Sai, Xiaofei Wen, Muhao Chen

The paper introduces COLAGUARD, a novel guardrail model that efficiently transfers multi-step safety reasoning into a continuous latent space, achieving high safety performance with massive improvemen…

cs.RO · cs.AI · cs.LG · Recent · May 29, 2026

Continuous Reasoning for Vision-Language-Action

Yueh-Hua Wu, Tatsuya Matsushima, Kei Ota

The paper proposes Continuous Reasoning for Vision-Language-Action (VLA) models, arguing that effective reasoning must be a shared, verifiable internal latent space rather than discrete text tokens, l…

cs.CR · cs.AI · Recent · Apr 10, 2026

Conflicts Make Large Reasoning Models Vulnerable to Attacks

Honghao Liu, Chengjin Xu, Xuhui Jiang, Cehao Yang +4 more

The paper demonstrates that confronting Large Reasoning Models (LRMs) with conflicting objectives, such as contradictory choices or conflicting alignment values, significantly increases their vulnerab…

cs.CV · cs.AI · Recent · May 29, 2026

Does Visual Information Play a Decisive Role in Vision-Language-Action Model Driving Behavior?

Jingtao He, Hongliang Lu, Xiaoyun Qiu, Yixuan Wang +1 more

The paper introduces a structured multi-level visual perturbation framework to systematically analyze how dependent VLA-based driving behavior is on visual information, revealing uneven visual groundi…

cs.CR · cs.AI · Recent · Apr 14, 2026

Parallax: Why AI Agents That Think Must Never Act

Joel Fokou

The paper introduces Parallax, an architectural framework that structurally separates AI reasoning from action execution to ensure robust safety for autonomous agents, achieving high attack mitigation…

cs.CV · cs.AI · cs.CL · Recent · May 29, 2026

SpatialAct: Probing Spatial Reasoning-to-Action Capabilities of VLM Agents in 3D Scenes

Tianhui Liu, Jie Feng, Zhiheng Zheng, Shengyuan Wang +5 more

The paper introduces SpatialAct, a challenging benchmark that reveals a significant 'reasoning-to-action gap,' showing that current VLMs struggle to maintain coherent spatial understanding and perform…

cs.RO · cs.AI · cs.LG · Recent · Jun 1, 2026

Permissive Safety Through Trusted Inference: Verifiable Belief-Space Neural Safety Filters for Assured Interactive Robotics

Haimin Hu

The paper proposes an algorithmic method using conformal prediction to formally certify high-probability safety for Belief-Space Neural Safety Filters (BeliefSF), significantly improving safety guaran…
