~ similar to 2606.06493· 19 results
The paper proposes CTRL-STEER, a closed-loop framework that adaptively adjusts intervention strength to stabilize concept regulation and improve task success in Vision-Language-Action models without r…
Junjie Ye, Rong Xue, Basile Van Hoorick, Runhao Li +5 more
RoboDream introduces an embodiment-centric world model that synthesizes photorealistic, physically feasible robot demonstrations by decoupling motion generation from environment synthesis, significant…
Xiao Li, Xiang Zheng, Yifeng Gao, Xinyu Xia +34 more
This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust,…
The paper proposes SWIM, a novel imitation learning method that can synthesize physically-based swimming motions from a single example, demonstrating superior data efficiency and generalization across…
The paper introduces using frozen, generalist value functions as differentiable surrogates to efficiently optimize and analyze new multi-embodiment robot designs without requiring repeated reinforceme…
Zhen Huang, Zhihuang Liu, Mengxuan Luo, Weishang Wu +1 more
The paper proposes a novel attack paradigm demonstrating how compromising a single robot in an LLM-controlled multi-robot system can rapidly propagate malicious intent to cause coordinated unsafe acti…
The paper proposes Completion at the Boundary (CaB), a method that predicts event-local completion objects using Boundary-Phase Tokens to improve the reliability of multi-step, handoff-heavy tasks for…
Muyuan Ma, Houcheng Li, Haotian Zhai, Lijun Han +3 more
The paper proposes a simulation-trained variable impedance control framework for wearable exoskeletons that safely and effectively augments human physical capabilities across multiple tasks.
The paper proposes an iCEM+TL framework that combines the Sample-efficient Cross-Entropy Method with Transfer Learning and Reward Redesign to improve robotic motion planning for complex tasks like sta…
Beichen Shao, Mengying Xie, Heng Su, Wanyi Zhang +4 more
GSAM introduces a generalizable and safe robotic framework for articulated object manipulation, significantly improving success rates and reducing variability across diverse tasks by integrating commo…
The paper introduces Center-of-Pressure (CoP), a physics-grounded tactile representation that enables robust zero-shot sim-to-real transfer for complex, contact-rich manipulation tasks.
Qiuyue Wang, Mingsheng Li, Jian Guan, Jinhui Ye +36 more
Qwen-VLA introduces a unified embodied foundation model that extends vision-language understanding to continuous action generation, enabling robust, multi-task generalization across diverse robotic ta…
Zhongxi Chen, Yifan Han, Yanming Shao, Huanming Liu +4 more
BORA is an offline-to-online RL framework that enhances dexterous VLA models for real-world robotics by using an action-conditioned critic and a lightweight residual adaptation mechanism to correct ex…
Ruoxuan Zhang, Qiaoqiao Wan, Zhengguang Wang, Chenghao Yu +3 more
The paper introduces MindClaw, a closed-loop framework that enables embodied agents to perform real-time mental-state reasoning and intervene with precision, significantly outperforming standard VLM b…
Tianyi Xie, Haotian Zhang, Jinhyung Park, Zi Wang +16 more
This paper presents GRAIL, a digital generation pipeline that synthesizes human-object interactions for humanoid robots.
The paper introduces TouchSafeBench, a physics-grounded benchmark, to evaluate collision grounding—the ability to predict robot-human collisions—and finds that current Vision-Language Models (VLMs) ar…
PhyGenHOI introduces a novel framework that generates physically accurate and visually faithful 4D Human-Object Interactions by coupling generative human motion with explicit physical object simulatio…
Dongwook Choi, Taeyoon Kwon, Bogyung Jeong, Minju Kim +5 more
EMBGuard introduces a novel, MLLM-based safety guardrail that explicitly identifies and explains physical hazards from (visual observation, action) pairs, enabling safer planning for embodied agents.
Shengyu Si, Yuanzhuo Lu, Ruimeng Yang, Ziyi Ye +2 more
VLA-Pro is a plug-and-play framework that enhances cross-task generalization in Vision-Language-Action models by storing and dynamically retrieving task-specific procedural memories, achieving signifi…