~ similar to 2605.29155 · 18 results
The paper proposes a Network Distributed Multi-Agent Reinforcement Learning (ND-MARL) framework that enables stable, scalable consensus control for large swarms of quadcopters using only local neighbo…
The paper introduces a Variational Encrypted Model Predictive Control (VEMPC) protocol that enables online MPC execution using only encrypted polynomial operations, eliminating the need for intermedia…
Yifei He, Rui Yang, Hao Bai, Tong Zhang +1 more
PRO-CUA introduces a process-reward optimization framework that enables efficient, step-level reinforcement learning for training computer use agents by decoupling environment interaction from policy…
The paper introduces AgenticRL, a self-refining reinforcement learning framework that uses a multimodal GPT agent to automatically design, refine, and deploy reward functions for complex UAV navigatio…
The paper proposes a scalable, distributed approach for constrained Multi-Agent Reinforcement Learning by using local consensus over dual variables to ensure global constraint satisfaction without cen…
Martin Schuck, Marcel P. Rath, Yufei Hua, Abhishek Goudar +2 more
Crazyflow is a novel, highly accelerated, and differentiable drone simulator that provides a unified platform for generating large-scale synthetic data for aerial robotics, enabling advanced training…
The paper proposes DNQ, a scalable solver-in-the-loop framework for training agents in multi-turn simultaneous bidding games by leveraging pairwise payoff estimation to approximate complex equilibrium…
The paper proposes CTRL-STEER, a closed-loop framework that adaptively adjusts intervention strength to stabilize concept regulation and improve task success in Vision-Language-Action models without r…
The paper introduces NASimJax, a GPU-accelerated framework that significantly speeds up network simulation for reinforcement learning, enabling large-scale, realistic training for penetration testing.
Dong Jing, Jingchen Nie, Tianqi Zhang, Jiaqi Liu +3 more
TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.
The paper introduces Prompted Policy Optimization (PromptPO), an LLM-based method that successfully optimizes policies for various sequential RL tasks, demonstrating that LLMs can replace classical RL…
This paper demonstrates that Large Language Models (LLMs) can serve as accurate and selective surrogates for costly GPU kernel performance measurements, significantly expanding the search space for op…
The paper proposes an energy-efficient drag reduction strategy for turbulent flows by combining Multi-Agent Deep Reinforcement Learning with SHAP-guided explainable deep learning, achieving superior p…
Oussama Zaim, Mélodie Daniel, Aly Magassouba, Miguel Aranda +1 more
The paper proposes a robust sim-to-sim-to-real DRL approach to enable double-Ackermann robots to achieve full pose control despite significant actuation uncertainties and discrepancies between simulat…
The paper proposes Multi-Agent Computer Use (MACU) systems, which significantly improve performance on complex, long-horizon tasks by enabling parallel execution and dynamic task decomposition compare…
The paper introduces Coordination Graphs for Constrained Multi-Agent Reinforcement Learning (CG-CMARL), a scalable framework that decomposes complex joint action spaces into pairwise regions to handle…
Lichao Wang, Zhaoxing Ren, Tianzhuo Yang, Jiaming Ji +3 more
SafeMCP is a server-side defense plugin that uses look-ahead reasoning to proactively filter and constrain tool acquisition for LLM agents, thereby mitigating catastrophic risks associated with expand…
Sizhe Lester Li, Evan Kim, Xingjian Bai, Tong Zhao +3 more
The paper proposes VERA, a decoupled policy that uses an action-free video world model combined with an embodiment-specific Inverse Dynamics Model (IDM) to achieve generalizable, zero-shot robot contr…