ArXivCSExplorer
Built by Teycir Ben Soltane

Similar to 2606.05143 · 19 results

cs.LG · Recent · Jun 1, 2026

Coherent Off-Policy Improvement of Large Behavior Models with Learned Rewards

Christian Scherer, Joe Watson, Theo Gruner, Daniel Palenicek +2 more

The paper proposes a coherent inverse reinforcement learning (IRL) method to improve large behavior models for robotic control, achieving superior sample efficiency and performance on complex sparse m…

cs.AI · Recent · May 28, 2026

Physically Viable World Models: A Case for Query-Conditioned Embodied AI

Adam J. Thorpe, Stepan Tretiakov, Cheng-Hsi Hsiao, Su Ann Low +5 more

The paper argues that for embodied AI to be safe and effective, world models must be physically viable, requiring a structural shift from mere observation prediction to representing the underlying phy…

cs.AI, cs.LG · Recent · May 29, 2026

From Noise to Control: Parameterized Diffusion Policies

Renhao Zhang, Haotian Fu, Mingxi Jia, George Konidaris +2 more

The Parameterized Diffusion Policy (PDP) framework transforms diffusion models from general stochastic generators into precise, steerable tools for learning and adapting complex robotic behaviors by e…

cs.AI · Recent · May 28, 2026

MiraBench: Evaluating Action-Conditioned Reliability in Robotic World Models

Tianzhuo Yang, Zihan Shen, Zirui Mi, Zhaoyi Zhang +6 more

The paper introduces MiraBench, a new benchmark that evaluates the action-conditioned reliability of robotic world models, finding that visual fidelity is insufficient and that optimism bias is a perv…

cs.RO, cs.AI · Recent · May 27, 2026

Visualizing Latent Phase Structures in Locomotion Policies: A Multi-Environment Study with Temporal Feature Extension

Daisuke Yasui, Toshitaka Matuki, Hiroshi Sato

The paper proposes a novel framework to visualize and uncover latent, structured motion phases in deep reinforcement learning locomotion policies by augmenting state observations with action and next-…

cs.AI, cs.LG, cs.LO · Recent · May 29, 2026

Robust Shielding for Safe Reinforcement Learning

Edwin Hamel-De le Court, Thom Badings, Alessandro Abate, Francesco Belardinelli +1 more

The paper introduces a novel shielding framework for Robust MDPs (RMDPs) that guarantees safety under worst-case transition probabilities, enabling safe reinforcement learning even when transition dyn…

cs.RO, cs.AI, cs.CV · Recent · May 27, 2026

Turning Video Models into Generalist Robot Policies

Sizhe Lester Li, Evan Kim, Xingjian Bai, Tong Zhao +3 more

The paper proposes VERA, a decoupled policy that uses an action-free video world model combined with an embodiment-specific Inverse Dynamics Model (IDM) to achieve generalizable, zero-shot robot contr…

cs.RO, cs.CV · Recent · Jun 1, 2026

RoboDream: Compositional World Models for Scalable Robot Data Synthesis

Junjie Ye, Rong Xue, Basile Van Hoorick, Runhao Li +5 more

RoboDream introduces an embodiment-centric world model that synthesizes photorealistic, physically feasible robot demonstrations by decoupling motion generation from environment synthesis, significant…

cs.RO · Recent · Jun 3, 2026

X4Val: Learning Neural Surrogates for Variance-Reduced Policy Evaluation

Rachel Luo, Michael Watson, Apoorva Sharma, Heng Yang +5 more

This paper introduces X4Val, a framework for variance-reduced real-world metric estimation using non-paired, multi-domain data.

cs.AI · Recent · May 30, 2026

Decoupled Behavioral Cloning for Scalable Inductive Generalization in RL from Specifications

Vignesh Subramanian, Subhajit Roy, Suguman Bansal

The paper proposes DIBS, a decoupled behavioral cloning approach that stabilizes inductive generalization in RL by separating task-specific policy learning from the evolution function, leading to impr…

cs.RO · Recent · Jun 3, 2026

Generalization of World Models under Environmental Variability for Vision-based Quadrotor Navigation

Luca Zanatta, Grzegorz Malczyk, Kostas Alexis

This paper investigates the robustness of world models in vision-based quadrotor navigation and identifies factors governing their quality.

cs.LG, cs.AI · Recent · May 30, 2026

Behavior-Invariant Task Representation Learning with Transformer-based World Models for Offline Meta-Reinforcement Learning

Fuyuan Qian, Menglong Zhang, Song Wang, Quanying Liu

The paper proposes a novel framework combining behavior-invariant task representation learning and a Transformer-based world model to achieve robust generalization in offline meta-reinforcement learni…

cs.AI · Recent · Jun 1, 2026

TERRA: Task-Embedded Reasoning and Representation Architecture for Cross-Domain Applications

Shayan Shokri

The paper formally addresses the challenging question of cross-domain transferability of latent predictive models by proposing a structured framework that quantifies the relationship between source an…

cs.CR · Recent · Jun 2, 2026

Same Weights, Different Robot: A Deployment Safety View of VLA Policies

Jianwei Tai

The paper identifies a 'deployment-safety gap' in Vision-Language-Action (VLA) policies, showing that identical model checkpoints can result in physically different and unsafe robot actions due to act…

cs.RO, cs.AI · Recent · May 30, 2026

Shape Your Body: Value Gradients for Multi-Embodiment Robot Design

Nico Bohlinger, Jan Peters

The paper introduces using frozen, generalist value functions as differentiable surrogates to efficiently optimize and analyze new multi-embodiment robot designs without requiring repeated reinforceme…

cs.CR, cs.AI, cs.CV · Recent · Mar 28, 2026

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

Xiao Li, Xiang Zheng, Yifeng Gao, Xinyu Xia +34 more

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust,…

cs.LG, cs.AI, math.NA · Recent · May 27, 2026

Hybrid Neural World Models

Pranav Lakshmanan, Paras Chopra

The paper introduces hybrid neural world models that provide fast, multi-horizon predictions for complex physical dynamics, implicitly handling sharp events like shocks and contacts without explicit t…

cs.LG, cs.CL, cs.CR · Recent · May 14, 2026

LiSA: Lifelong Safety Adaptation via Conservative Policy Induction

Minbeom Kim, Lesly Miculicich, Bhavana Dalvi Mishra, Mihir Parmar +5 more

LiSA introduces a conservative policy induction framework that enhances fixed AI guardrails by converting sparse, noisy failure reports into reusable, generalized policies, significantly improving saf…

cs.LG, cs.CL · Recent · May 31, 2026

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification

Yuhang Zhou, Lizhu Zhang, Yifan Wu, Mingyi Wang +4 more

OmniOPD introduces a logit-free, chunk-level distillation framework that improves on standard On-Policy Distillation by using semantic similarity and peak-entropy scheduling, achieving state-of-the-ar…
