Papers similar to 2605.29425 · 19 results

cs.CR, cs.LG, cs.RO · Recent · May 27, 2026

ReasonBreak: Probing Vulnerabilities in Reasoning-Enabled Vision-Language-Action Models for Autonomous Driving

Mohammadreza Teymoorianfard, Jean-Philippe Monteuuis, Jonathan Petit, Amir Houmansadr

This paper demonstrates that reasoning-enabled Vision-Language-Action (VLA) models for autonomous driving are highly vulnerable to realistic input perturbations, significantly compromising both reason…

cs.RO, cs.AI, cs.CV · Recent · May 31, 2026

DeepIPCv3: Event-Aware Multi-Modal Sensor Fusion for Sudden Pedestrian Crossing Avoidance

Oskar Natan, Andi Dharmawan, Aufaclav Zatu Kusuma Frisky, Jazi Eko Istiyanto +1 more

DeepIPCv3 is a novel multi-modal framework that fuses LiDAR and DVS event streams using cross-modal attention to achieve state-of-the-art, highly reactive avoidance maneuvers for sudden pedestrian cro…

cs.CV, cs.AI · Recent · May 29, 2026

Does Visual Information Play a Decisive Role in Vision-Language-Action Model Driving Behavior?

Jingtao He, Hongliang Lu, Xiaoyun Qiu, Yixuan Wang +1 more

The paper introduces a structured multi-level visual perturbation framework to systematically analyze how dependent VLA-based driving behavior is on visual information, revealing uneven visual groundi…

cs.CR, cs.AI, cs.MM · Recent · Apr 9, 2026

Multimodal Reasoning with LLM for Encrypted Traffic Interpretation: A Benchmark

Longgang Zhang, Xiaowei Fu, Fuxiang Huang, Lei Zhang

The paper introduces a new benchmark (BGTD) and a multimodal framework (mmTraffic) that enables explainable, evidence-grounded interpretation of encrypted network traffic using LLMs.

cs.CL, cs.AI, cs.CV · Recent · Jun 1, 2026

PaSBench-Video: A Streaming Video Benchmark for Proactive Safety Warning

Yusong Zhao, Yuejin Xie, Youliang Yuan, Junjie Hu +3 more

The paper introduces PaSBench-Video, a comprehensive streaming video benchmark designed to rigorously test multimodal LLMs' ability to issue proactive safety warnings, finding that current models stru…

eess.SY, cs.CR, math.OC · Recent · May 13, 2026

Day-to-Day Traffic Network Modeling under Route-Guidance Misinformation: Endogenous Trust and Resilience in CAV Environments

Eunhan Ka, Satish V. Ukkusuri

The paper develops a trust-aware framework to model how connected vehicles adapt their routing decisions and overall traffic flow when exposed to misinformation, showing that endogenous trust provides…

cs.AI · Recent · May 28, 2026

Uncertainty-Aware and Temporally Regulated Expert Advice in Reinforcement Learning for Autonomous Driving

Ahmed Abouelazm, Felix Klingebiel, Philip Schörner, J. Marius Zöllner

The paper introduces an uncertainty-aware framework that uses regulated expert advice to guide safe and efficient exploration for autonomous driving policies, significantly improving performance in co…

cs.CR, cs.LG · Recent · Apr 27, 2026

CAN-QA: A Question-Answering Benchmark for Reasoning over In-Vehicle CAN Traffic

Jing Chen, Abhijay Deevi, Onat Gungor, Tajana Rosing

The paper introduces CAN-QA, a novel question-answering benchmark that reformulates CAN traffic analysis from a classification task to a reasoning task, demonstrating that current LLMs struggle with c…

cs.CV, cs.LG, eess.IV · Recent · Jun 3, 2026

An Open-Source Two-Stage Computer Vision Pipeline for Fine-Grained Vehicle Classification using Vision Transformers

Gandhimathi Padmanaban, Fred Feng

This paper presents an open-source computer vision pipeline for classifying vehicle body types from naturalistic roadway video.

cs.CV · Recent · Jun 1, 2026

Reason-Then-Retrieve for CoVR-R with Structured Edit Prompts and Dense-Sparse Fusion

DongQing Liu, MengShi Qi, HongWei Ji

The paper proposes a zero-shot reason-then-retrieve pipeline using Qwen3.5-27B to solve the challenging task of composed video retrieval (CoVR-R), achieving high performance on both validation and bli…

cs.CL · Recent · May 29, 2026

ConsisGuard: Aligning Safety Deliberation with Policy Enforcement in LLM Guardrails

Yan Wang, Zhixuan Chu, Zihao Xue, Zhen Bi +8 more

The paper introduces ConsisGuard, a framework that addresses the 'deliberation-to-enforcement gap' in LLM guardrails by ensuring that the reasoning process is faithfully and consistently translated in…

cs.AI · Recent · May 27, 2026

Modeling Vehicle-Type-Specific Pedestrian Crash Avoidance Behavior in Safety-Critical Interactions Using Smooth-Mamba Deep Reinforcement Learning

Qingwen Pu, Kun Xie, Hong Yang, Di Yang +1 more

The paper develops a novel deep reinforcement learning framework, SMamba-DDPG, to accurately model vehicle-type-specific pedestrian crash avoidance behavior, finding that pedestrians react faster and…

cs.RO, cs.AI, cs.LG · Recent · May 27, 2026

Multi-Resolution End-to-End Deep Neural Network for Optimizing Latency-Accuracy Tradeoff in Autonomous Driving

Qitao Weng, Heechul Yun

The paper proposes a multi-resolution end-to-end deep neural network for autonomous driving that dynamically adjusts input resolution to optimize the critical tradeoff between prediction accuracy and…

cs.CR · Recent · May 2, 2026

From Stealthy Data Fabrication to Unsafe Driving: Realistic Scenario Attacks on Collaborative Perception

Qingzhao Zhang, Runting Zhang, Z. Morley Mao

The paper introduces a stealthy, scenario-realistic data fabrication attack that subtly manipulates object poses in shared perception data to induce unsafe driving behaviors in connected and autonomou…

cs.AI · Recent · May 31, 2026

Large Language Models in Transportation Systems Management and Operations: From Text Reasoning to Multi-modal Decision Support

Siyan Li, Zehao Wang, Jiachen Li, Kanok Boriboonsomsin +2 more

This survey reviews how Large and Multi-modal Language Models (LLMs/MM-LLMs) are being applied to integrate diverse data sources for enhanced decision support in transportation systems management and…

cs.CV, cs.CL · Recent · May 29, 2026

Attend to Evidence: Evidence-Anchored Spatial Attention Supervision for Multimodal RLVR

Ruina Hu, Chen Wang, Lai Wei, Jionghao Bai +4 more

The paper introduces EASE, a method that enhances multimodal Reinforcement Learning with Verifiable Rewards (RLVR) by providing spatial attention supervision anchored to visual evidence, significantly…

cs.AI · Recent · May 27, 2026

Agentic Active Omni-Modal Perception for Multi-Hop Audio-Visual Reasoning

Ke Xu, Yuhao Wang, Ziyang Cheng, Hongcheng Liu +2 more

The paper introduces MOV-Bench, a challenging benchmark for multi-hop audio-visual reasoning, and proposes AOP-Agent, an agentic framework that significantly improves open-source Omni-LLMs' ability to…

cs.CV, cs.AI · Recent · May 28, 2026

VisualThink-VLA: Visual Intermediate Reasoning for Effective and Low-Latency Vision-Language-Action Policies

Mingjian Gao, Wenqiao Zhang, Yuqian Yuan, Yang Dai +8 more

VisualThink-VLA introduces a visual intermediate-reasoning framework that guides action prediction using compact visual evidence, achieving high accuracy and low latency for real-time Vi…

cs.CR, cs.CV · Recent · May 12, 2026

Still Camouflage, Moving Illusion: View-Induced Trajectory Manipulation in Autonomous Driving

Shuo Ju, Qingzhao Zhang, Huashan Chen, Xuheng Wang +5 more

The paper introduces a novel adversarial attack that uses static, view-dependent camouflage on a vehicle to induce consistent feature drift, causing autonomous systems to predict false, yet plausible,…
