~ similar to 2605.31120· 19 results
Junjie Ye, Rong Xue, Basile Van Hoorick, Runhao Li +5 more
RoboDream introduces an embodiment-centric world model that synthesizes photorealistic, physically feasible robot demonstrations by decoupling motion generation from environment synthesis, significant…
Tianyi Xie, Haotian Zhang, Jinhyung Park, Zi Wang +16 more
This paper presents GRAIL, a digital generation pipeline that synthesizes human-object interactions for humanoid robots.
PhyGenHOI introduces a novel framework that generates physically accurate and visually faithful 4D Human-Object Interactions by coupling generative human motion with explicit physical object simulatio…
The paper introduces using frozen, generalist value functions as differentiable surrogates to efficiently optimize and analyze new multi-embodiment robot designs without requiring repeated reinforceme…
Lizhi Yang, Junheng Li, Nehar Poddar, Yiling Hou +4 more
This paper proposes a compact, explicit interface for humanoid robots that enables diverse manipulation skills and demonstrates its feasibility through natural-language-driven task roll-outs.
Inhee Lee, Sangwon Baik, Sungjoo Kim, Hyeonwoo Kim +2 more
SimuScene introduces a novel compositional 3D reconstruction pipeline that integrates physics simulation directly into the shape and layout estimation process to generate stable, simulation-ready 3D s…
T2Mo is a novel framework that generates controllable dynamic 3D object shapes by combining explicit 3D trajectories for spatial guidance with natural language text semantics.
Qiuyue Wang, Mingsheng Li, Jian Guan, Jinhui Ye +36 more
Qwen-VLA introduces a unified embodied foundation model that extends vision-language understanding to continuous action generation, enabling robust, multi-task generalization across diverse robotic ta…
Ben Wang, Xiaogang Li, Ruochen Gao, Peiyao Xiao +5 more
The paper introduces BilliardPhys-Bench, a new benchmark that demonstrates that current multimodal LLMs struggle with complex physical reasoning and predicting object dynamics in simulated environment…
Adam J. Thorpe, Stepan Tretiakov, Cheng-Hsi Hsiao, Su Ann Low +5 more
The paper argues that for embodied AI to be safe and effective, world models must be physically viable, requiring a structural shift from mere observation prediction to representing the underlying phy…
Chenhao Bai, Liqin Lu, Kaijun Wang, Hui Chen +4 more
This paper studies how to scale robust robot policies by expanding physical domains in a recoverable way.
The paper introduces SPAWN, a training-free method that allows users to inject specified visual concepts into existing autoregressive world models, enabling controllable scene composition beyond the i…
Zekun Qi, Xuchuan Chen, Dairu Liu, Chenghuai Lin +9 more
The paper introduces Humanoid-GPT, a large-scale generative Transformer model that achieves robust zero-shot motion tracking and control by training on a massive, unified corpus of motion data.
This paper presents Mana, a sim-to-real framework for dexterous articulated tool manipulation.
The paper proposes a novel framework to visualize and uncover latent, structured motion phases in deep reinforcement learning locomotion policies by augmenting state observations with action and next-…
Sizhe Lester Li, Evan Kim, Xingjian Bai, Tong Zhao +3 more
The paper proposes VERA, a decoupled policy that uses an action-free video world model combined with an embodiment-specific Inverse Dynamics Model (IDM) to achieve generalizable, zero-shot robot contr…
The paper proposes a novel cross-axis feature fusion architecture and an auxiliary joint-difference prediction task to significantly improve text-based 3D human motion editing by better understanding…
CubePart is a generative framework that enables the creation of complex 3D meshes by explicitly controlling and generating individual, semantically defined parts based on open-vocabulary text prompts.
Jingyun Liang, Min Wei, Shikai Li, Yizeng Han +4 more
The paper proposes a novel render-free framework that conditions video diffusion models directly on compressed 3D human mesh tokens, enabling robust 3D-aware human motion control without relying on re…