ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2606.06461· 20 results

cs.AIcs.LGstat.MLRecentJun 1, 2026

ReSkill: Reconciling Skill Creation with Policy Optimization in Agentic RL

Zelin He, Haotian Lin, Boran Han, Wei Zhu +5 more

ReSkill is an RL-in-the-loop framework that reconciles skill creation and policy optimization by automatically creating, testing, and refining modular skills alongside the agent's policy learning, lea…

View →
cs.CLRecentJun 1, 2026

HarnessForge: Joint Harness and Policy Evolution for Adaptive Agent Systems

Mingju Chen, Can Lv, Guibin Zhang, Heng Chang +1 more

HarnessForge introduces a meta-adaptive framework that jointly evolves the execution structure (harness) and the reasoning policy of LLM agents, significantly improving overall system performance acro…

View →
cs.CLRecentMay 29, 2026

Skill is Not One-Size-Fits-All: Model-Aware Skill Alignment for LLM Agents

Jianxiang Yu, Jiapeng Zhu, Bochen Lin, Qier Cui +2 more

The paper introduces MASA, a model-aware skill alignment framework that adaptively rewrites general and task-specific skills for LLM agents, achieving superior performance across diverse backbones and…

View →
cs.AIcs.LGRecentJun 1, 2026

SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training

Zhongyu He, Yuanfan Li, Fei Huang, Tianyu Chen +8 more

SIRI introduces a self-internalizing reinforcement learning framework that allows LLM agents to autonomously discover and integrate reusable skills directly into their core policy, significantly impro…

View →
cs.ROcs.AIRecentMay 31, 2026

Implicit Drifting Policy: One-Step Action Generation via Conditional Expert Geometry

Zemin Yang, Yaoyu He, Yiming Zhong, Yuhao Zhang +4 more

The Implicit Drifting Policy (IDP) is a novel one-step action generation framework that implicitly enforces trajectory correction constraints by analyzing local expert action geometry, overcoming the…

View →
cs.LGcs.AIRecentMay 29, 2026

Skill Reuse as Compression in Agentic RL

Zhikun Xu, Yu Feng, Jacob Dineen, Taiwei Shi +2 more

The paper proposes ReuseRL, a method that improves agent generalization in Reinforcement Learning by enforcing structural compressibility of successful agent trajectories into reusable skills.

View →
cs.CLcs.AIRecentMay 30, 2026

Skill or Skip? Learning Selective Skill Invocation in Agentic Tasks via Dual-Granularity Preference Learning

Chishui Chen, Jiaye Lin, Te Sun, Junxi Wang +5 more

SelSkill introduces a dual-granularity preference learning framework that treats skill use as a 'skill-or-skip' decision, significantly improving agent performance and execution precision in complex a…

View →
cs.CRcs.AIcs.CLRecentMay 12, 2026

SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces

Chang Jin, An Wang, Zeming Wei, Kai Wang +6 more

The paper introduces SkillSafetyBench, a comprehensive benchmark demonstrating that agent safety failures often stem from adversarial influences within reusable skills and execution environments, rath…

View →
cs.CRRecentApr 30, 2026

MASCing: Configurable Mixture-of-Experts Behavior via Activation Steering Masks

Jona te Lintelo, Lichao Wu, Marina Krček, Sengim Karayalçin +1 more

MASCing is a novel framework that enables flexible, non-retraining reconfiguration of Mixture-of-Experts (MoE) models for specific safety objectives by applying activation steering masks to control ex…

View →
cs.LGcs.AIRecentJun 1, 2026

Adaptive Auto-Harness: Sustained Self-Improvement for Agentic System Deployment on Open-Ended Task Streams

Zewen Liu, Zhan Shi, Yisi Sang, Bing He +6 more

Adaptive Auto-Harness introduces a framework that enables LLM agents to sustain self-improvement and maintain high performance over open-ended, shifting task streams, outperforming existing fixed-benc…

View →
cs.AIcs.CRRecentMay 5, 2026

Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours

Raja Sekhar Rao Dheekonda, Will Pearce, Nick Landers

The paper introduces an AI red teaming agent that drastically reduces the time and effort required for security testing by allowing operators to define complex attack goals using natural language, com…

View →
cs.LGcs.CLcs.CRRecentMay 14, 2026

LiSA: Lifelong Safety Adaptation via Conservative Policy Induction

Minbeom Kim, Lesly Miculicich, Bhavana Dalvi Mishra, Mihir Parmar +5 more

LiSA introduces a conservative policy induction framework that enhances fixed AI guardrails by converting sparse, noisy failure reports into reusable, generalized policies, significantly improving saf…

View →
cs.AIRecentMay 29, 2026

Closed-Loop Neural Activation Control in Vision-Language-Action Models

Abhijith Babu, Ramneet Kaur, Nathaniel D. Bastian, Olivera Kotevska +4 more

The paper proposes CTRL-STEER, a closed-loop framework that adaptively adjusts intervention strength to stabilize concept regulation and improve task success in Vision-Language-Action models without r…

View →
cs.AIcs.LGRecentMay 29, 2026

From Noise to Control: Parameterized Diffusion Policies

Renhao Zhang, Haotian Fu, Mingxi Jia, George Konidaris +2 more

The Parameterized Diffusion Policy (PDP) framework transforms diffusion models from general stochastic generators into precise, steerable tools for learning and adapting complex robotic behaviors by e…

View →
cs.AIRecentMay 31, 2026

"Skill issues'': data-centric optimization of lakehouse agents

Nicole Rose Schneider, Davide Ghilardi, Giacomo Piccinini, Jacopo Tagliabue

The paper introduces a data-centric optimization pipeline to improve coding agents' ability to interact with a branching lakehouse, showing significant accuracy gains by treating agent evaluation as a…

View →
cs.CLcs.AIcs.LGRecentMay 31, 2026

SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories

Zhuoyun Yu, Xin Xie, Wuguannan Yao, Chenxi Wang +3 more

SkillAdaptor is a novel, training-free framework that enables stable, step-level adaptation of external skills for LLM agents by precisely attributing failures to specific skills.

View →
cs.AIRecentMay 27, 2026

You Live More Than Once: Towards Hierarchical Skill Meta-Evolving

Xujun Li, Kehan Zheng, Mingyuan Zhao, Yize Geng +6 more

The paper proposes HiSME, a lightweight hierarchical skill meta-evolving solution that jointly optimizes skills and the skill evolving strategy by learning meta-skills from task execution traces, lead…

View →
cs.LGcs.AIRecentMay 27, 2026

SPAR: Support-Preserving Action Rectification

Jiaxin Zhao, Weihang Pan, Xun Liang, Binbin Lin

SPAR introduces a novel framework that rectifies action policies by performing local fine-tuning in a residual space anchored to a pure behavior cloning policy, achieving state-of-the-art performance…

View →
cs.AIRecentMay 28, 2026

MiraBench: Evaluating Action-Conditioned Reliability in Robotic World Models

Tianzhuo Yang, Zihan Shen, Zirui Mi, Zhaoyi Zhang +6 more

The paper introduces MiraBench, a new benchmark that evaluates the action-conditioned reliability of robotic world models, finding that visual fidelity is insufficient and that optimism bias is a perv…

View →
cs.ROcs.AIRecentMay 29, 2026

DeMaVLA: A Vision-Language-Action Foundation Model for Generalizable Deformable Manipulation

Taiyi Su, Jian Zhu, Tianjian Wang, Youzhang He +8 more

DeMaVLA is a generalizable Vision-Language-Action foundation model designed for deformable object manipulation, achieving strong real-world performance on folding tasks by leveraging large-scale real-…

View →