Robotics
Robot learning, manipulation, navigation, embodied AI
20 papers indexed
Turning Video Models into Generalist Robot Policies
Sizhe Lester Li, Evan Kim, Xingjian Bai, Tong Zhao +3 more
The paper proposes VERA, a decoupled policy that uses an action-free video world model combined with an embodiment-specific Inverse Dynamics Model (IDM) to achieve generalizable, zero-shot robot contr…
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments
Qiuyue Wang, Mingsheng Li, Jian Guan, Jinhui Ye +36 more
Qwen-VLA introduces a unified embodied foundation model that extends vision-language understanding to continuous action generation, enabling robust, multi-task generalization across diverse robotic ta…
Mana: Dexterous Manipulation of Articulated Tools
This paper presents Mana, a sim-to-real framework for dexterous articulated tool manipulation.
Probing Collision Grounding in Vision-Language Models for Safe Human-Robot Collaboration
The paper introduces TouchSafeBench, a physics-grounded benchmark, to evaluate collision grounding—the ability to predict robot-human collisions—and finds that current Vision-Language Models (VLMs) ar…
Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses
Xiao Li, Xiang Zheng, Yifeng Gao, Xinyu Xia +34 more
This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust,…
HANDOFF: Humanoid Agentic Task-Space Whole-Body Control via Distilled Complementary Teachers
Lizhi Yang, Junheng Li, Nehar Poddar, Yiling Hou +4 more
This paper proposes a compact, explicit interface for humanoid robots that enables diverse manipulation skills and demonstrates its feasibility through natural-language-driven task roll-outs.
Sample-efficient Low-level Motion Planning for Robotic Manipulation Tasks via Zero-shot Transfer Learning
The paper proposes an iCEM+TL framework that combines the Sample-efficient Cross-Entropy Method with Transfer Learning and Reward Redesign to improve robotic motion planning for complex tasks like sta…
RoboDream: Compositional World Models for Scalable Robot Data Synthesis
Junjie Ye, Rong Xue, Basile Van Hoorick, Runhao Li +5 more
RoboDream introduces an embodiment-centric world model that synthesizes photorealistic, physically feasible robot demonstrations by decoupling motion generation from environment synthesis, significant…
TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies
Dong Jing, Jingchen Nie, Tianqi Zhang, Jiaqi Liu +3 more
TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.
Extreme dynamic symmetry enables omnidirectional and multifunctional robots
The paper introduces dynamic symmetry (the uniformity of attainable center-of-mass accelerations) and demonstrates that leveraging it significantly enhances a robot's agility, robustness, and multifunction…
FACTR 2: Learning External Force Sensing for Commodity Robot Arms Improves Policy Learning
Steven Oh, Jason Jingzhou Liu, Tony Tao, Philip Han +4 more
This paper presents a data-driven method to estimate external joint torques without dedicated force sensors, enabling force-feedback teleoperation on low-cost arms.
AFUN: Towards an Affordance Foundation Model for Functionality Understanding
Zhaoning Wang, Yi Zhong, Jiawei Fu, Henrik I. Christensen +1 more
The paper introduces AFUN, a model that predicts both the location (functional mask) and the motion (3D curve) for robot interaction, aiming to create a generalizable foundation model for understandin…
GRAIL: Generating Humanoid Loco-Manipulation from 3D Assets and Video Priors
Tianyi Xie, Haotian Zhang, Jinhyung Park, Zi Wang +16 more
This paper presents GRAIL, a digital generation pipeline that synthesizes human-object interactions for humanoid robots.
Propagating Unsafe Actions in LLM Controlled Multi-Robot Collaboration via Single Robot Compromise
Zhen Huang, Zhihuang Liu, Mengxuan Luo, Weishang Wu +1 more
The paper proposes a novel attack paradigm demonstrating how compromising a single robot in an LLM-controlled multi-robot system can rapidly propagate malicious intent to cause coordinated unsafe acti…
Not What You Asked For: Typographic Attacks in Household Robot Manipulation
This paper demonstrates that typographic attacks pose a significant, measurable, and physically consequential threat to household robot manipulation systems by causing the robot to grasp and transport…
RoboWits: Unexpected Challenges for Robotic Creative Problem Solving
Chunru Lin, Hongxin Zhang, Fenghao Yu, Zhehuan Chen +4 more
The paper introduces RoboWits, a new bimanual robotic benchmark designed to test a robot's cognitive reasoning and adaptability to unexpected challenges, revealing that current Vision-Language-Action…
DIRECT: When and Where Should You Allocate Test-Time Compute in Embodied Planners?
Jadelynn Dao, Milan Ganai, Yasmina Abukhadra, Ajay Sridhar +6 more
This paper introduces DIRECT, a routing framework that allocates test-time compute per prompt to improve the success-cost Pareto frontier for embodied agents.
AI-IoT-Robotics Integration: Survey of Frameworks, Emerging Trends, and the Path Toward Connected Robotics
This survey synthesizes the state-of-the-art in AI-IoT-Robotics integration, proposing a modular architecture and highlighting hybrid SLM-LLM systems as the path toward next-generation Connected Robot…
Crazyflow: An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX
Martin Schuck, Marcel P. Rath, Yufei Hua, AbhisheK Goudar +2 more
Crazyflow is a novel, highly accelerated, and differentiable drone simulator that provides a unified platform for generating large-scale synthetic data for aerial robotics, enabling advanced training…
Phantom Force: Injecting Adversarial Tactile Perceptions into Embodied Intelligence via EMI
This paper investigates a novel vulnerability in tactile sensing by demonstrating that targeted Electromagnetic Interference (EMI) can induce strong, misleading 'phantom forces' in Hall-effect fingert…