Similar to 2606.05159 · 19 results
This paper investigates the robustness of world models in vision-based quadrotor navigation and identifies factors governing their quality.
Tianzhuo Yang, Zihan Shen, Zirui Mi, Zhaoyi Zhang +6 more
The paper introduces MiraBench, a new benchmark that evaluates the action-conditioned reliability of robotic world models, finding that visual fidelity is insufficient and that optimism bias is a perv…
Christian Scherer, Joe Watson, Theo Gruner, Daniel Palenicek +2 more
The paper proposes a coherent inverse reinforcement learning (IRL) method to improve large behavior models for robotic control, achieving superior sample efficiency and performance on complex sparse m…
The paper proposes CTRL-STEER, a closed-loop framework that adaptively adjusts intervention strength to stabilize concept regulation and improve task success in Vision-Language-Action models without r…
Zizhe Chen, Jiqian Dong, Yizhou Tian, Garry Yang +3 more
This paper introduces Numca and Hista, two novel techniques that significantly improve state value estimation for LLM reinforcement learning, addressing the instability of standard critic approaches.
The paper formally addresses the challenging question of cross-domain transferability of latent predictive models by proposing a structured framework that quantifies the relationship between source an…
Oussama Zaim, Mélodie Daniel, Aly Magassouba, Miguel Aranda +1 more
The paper proposes a robust sim-to-sim-to-real DRL approach to enable double-Ackermann robots to achieve full pose control despite significant actuation uncertainties and discrepancies between simulat…
The paper introduces a diagnostic framework to determine if World-Action Models (WAMs) provide genuinely actionable behavioral improvements beyond simply achieving task success, finding that WAMs ofte…
Dong Jing, Jingchen Nie, Tianqi Zhang, Jiaqi Liu +3 more
TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.
This paper provides the first non-vacuous generalization analysis for the Stochastic Variance Reduced Gradient (SVRG) method by establishing sharp, data-dependent algorithmic stability bounds, thereby…
Steven Oh, Jason Jingzhou Liu, Tony Tao, Philip Han +4 more
This paper presents a data-driven method to estimate external joint torques without dedicated force sensors, enabling force-feedback teleoperation on low-cost arms.
Chenhao Bai, Liqin Lu, Kaijun Wang, Hui Chen +4 more
This paper studies how to scale robust robot policies by expanding physical domains in a recoverable way.
The paper proposes using frozen, generalist value functions as differentiable surrogates to efficiently optimize and analyze new multi-embodiment robot designs without requiring repeated reinforceme…
This paper demonstrates that Large Language Models (LLMs) can serve as accurate and selective surrogates for costly GPU kernel performance measurements, significantly expanding the search space for op…
Zhongxi Chen, Yifan Han, Yanming Shao, Huanming Liu +4 more
BORA is an offline-to-online RL framework that enhances dexterous VLA models for real-world robotics by using an action-conditioned critic and a lightweight residual adaptation mechanism to correct ex…
The paper introduces Posterior Hybrid Bayesian Belief (PhyB), a novel framework that reformulates policy optimization in Bayesian Offline RL by approximating expectations as a convex combination over…
The paper proposes an iCEM+TL framework that combines the Sample-efficient Cross-Entropy Method with Transfer Learning and Reward Redesign to improve robotic motion planning for complex tasks like sta…
Martin Schuck, Marcel P. Rath, Yufei Hua, Abhishek Goudar +2 more
Crazyflow is a novel, highly accelerated, and differentiable drone simulator that provides a unified platform for generating large-scale synthetic data for aerial robotics, enabling advanced training…
Jinhe Bi, Aniri, Minglai Yang, Xingcheng Zhou +8 more
EchoRL proposes a lightweight module to exploit valuable learning signals from advantage-degenerated rollouts in Reinforcement Learning with Verifiable Rewards (RLVR), significantly improving LLM post…