Papers similar to 2606.00880

~ similar to 2606.00880· 18 results

cs.AIcs.CRcs.CYRecentApr 16, 2026

Layered Mutability: Continuity and Governance in Persistent Self-Modifying Agents

The paper introduces 'layered mutability,' a framework for analyzing how persistent self-modifying AI agents drift away from intended behavior due to the accumulation of locally reasonable, uncoordina…

View →

cs.LGcs.AIRecentJun 2, 2026

Using Reward Uncertainty to Induce Diverse Behaviour in Reinforcement Learning

Anthony GX-Chen, Ankit Anand, Gheorghe Comanici, Zaheer Abbas +6 more

The paper proposes a novel RL framework that naturally induces diverse agent behavior by reformulating the objective to treat the reward as a distribution over functions, making diversity a rational r…

View →

cs.LGcs.AIRecentMay 30, 2026

Behavior-Invariant Task Representation Learning with Transformer-based World Models for Offline Meta-Reinforcement Learning

Fuyuan Qian, Menglong Zhang, Song Wang, Quanying Liu

The paper proposes a novel framework combining behavior-invariant task representation learning and a Transformer-based world model to achieve robust generalization in offline meta-reinforcement learni…

View →

cs.AIRecentMay 30, 2026

Decoupled Behavioral Cloning for Scalable Inductive Generalization in RL from Specifications

Vignesh Subramanian, Subhajit Roy, Suguman Bansal

The paper proposes DIBS, a decoupled behavioral cloning approach that stabilizes inductive generalization in RL by separating task-specific policy learning from the evolution function, leading to impr…

View →

cs.AIRecentJun 1, 2026

EvoBrain: Continual Learning of EEG Foundation Models Across Heterogeneous BCI Tasks

Yangxuan Zhou, Sha Zhao, Jiquan Wang, Shijian Li +1 more

EvoBrain proposes a dynamic, cross-task continual learning framework to overcome the limitations of task-specific EEG decoding, enabling unified and scalable brain-computer interfaces.

View →

cs.AIcs.CLRecentJun 1, 2026

AGENTCL: Toward Rigorous Evaluation of Continual Learning in Language Agents

Yiheng Shu, Bernal Jiménez Gutiérrez, Saisri Padmaja Jonnalagedda, Yuguang Yao +2 more

The paper introduces AGENTCL, a rigorous evaluation framework that uses controlled task streams to accurately measure an agent's ability to accumulate and reuse knowledge across multiple tasks, thereb…

View →

cs.LGcs.AIRecentMay 29, 2026

Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO

Yiming Ren, Yiran Xu, Zicheng Lin, Chufan Shi +7 more

The paper proposes S2L-PO, a framework that uses smaller, naturally diverse models as structured explorers to enhance the policy-level diversity and performance of larger language models during traini…

View →

cs.CLRecentMay 29, 2026

TRACE: Discovering Task-Specific Parameter via Adaptation-Aware Probing for Continual Fine-Tuning

Xiaosong Han, Ke Chen, Xindi Dai, Di Liang +6 more

TRACE proposes a novel method to mitigate catastrophic forgetting in continual LLM fine-tuning by identifying and isolating a small, task-specific subset of essential parameters for each task.

View →

cs.CLRecentMay 30, 2026

Learning to Retrieve: Dual-Level Long-Term Memory for Text-to-SQL Agents

Yibo Wang, Nikki Lijing Kuang, Philip S. Yu, Zhewei Yao +1 more

The paper proposes MERIT, a dual-level, multi-horizon memory retrieval framework that significantly improves the performance of interactive text-to-SQL agents by providing both global and local memory…

View →

cs.LGcs.AIRecentMay 29, 2026

Skill Reuse as Compression in Agentic RL

Zhikun Xu, Yu Feng, Jacob Dineen, Taiwei Shi +2 more

The paper proposes ReuseRL, a method that improves agent generalization in Reinforcement Learning by enforcing structural compressibility of successful agent trajectories into reusable skills.

View →

cs.LGcs.AIRecentJun 1, 2026

Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment

Ran Liu, Min Yu, Mingqi Liu, Jianguo Jiang +6 more

The paper introduces AdvCL, a framework that repurposes adversarial perturbations as a geometric control signal to stabilize continual learning in large language models, significantly reducing forgett…

View →

cs.AIcs.LGstat.MLRecentJun 1, 2026

ReSkill: Reconciling Skill Creation with Policy Optimization in Agentic RL

Zelin He, Haotian Lin, Boran Han, Wei Zhu +5 more

ReSkill is an RL-in-the-loop framework that reconciles skill creation and policy optimization by automatically creating, testing, and refining modular skills alongside the agent's policy learning, lea…

View →

cs.LGcs.AIcs.CLRecentMay 27, 2026

Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents

Suji Kim, Kangsan Kim, Sung Ju Hwang

The paper introduces LearnWeak, an annotation-free framework that automatically specializes small computer-use agents by identifying and targeting their specific weaknesses using a stronger reference…

View →

cs.LGcs.AIRecentMay 28, 2026

Overcoming Forgetting in LLM Fine-Tuning with Evolution Strategies

Kajetan Schweighofer, Conor F. Hayes, Roberto Dailey, Risto Miikkulainen +1 more

This paper introduces Anchored Weight Decay (AWD), a regularization technique that effectively prevents prior-task forgetting during LLM fine-tuning with Evolution Strategies (ES), positioning ES as a…

View →

cs.AIRecentJun 1, 2026

Joint Agent Memory and Exploration Learning via Novelty Signals

Shizuo Tian, Xiaohong Weng, Rui Kong, Yuxuan Chen +8 more

The JAMEL framework addresses the challenge of effective exploration in open-ended environments by jointly training agent memory and exploration policies using natural, novelty-driven signals.

View →

cs.AIcs.CLcs.LGRecentMay 27, 2026

Where Rollouts Begin: Low-Load, High-Leverage First-Token Diversification for RLVR

Soeun Kim, Albert No

The paper introduces REFT, a novel method that diversifies rollouts by sampling the first token after the reasoning marker, significantly improving performance in Reinforcement Learning with Verifiabl…

View →

cs.AIRecentJun 1, 2026

TERRA: Task-Embedded Reasoning and Representation Architecture for Cross-Domain Applications

Shayan Shokri

The paper formally addresses the challenging question of cross-domain transferability of latent predictive models by proposing a structured framework that quantifies the relationship between source an…

View →

cs.LGcs.AIRecentMay 27, 2026

TIMEGATE: Sustainable Time-Boxed Promotion Gates for Continual ML Adaptation Under Resource Constraints

Abhijit Chakraborty, Suddhasvatta Das, Yash Shah, Vivek Gupta +1 more

TIMEGATE introduces a resource-aware policy layer that manages continual ML adaptation by dynamically budgeting time and evaluation resources, achieving significant compute and energy savings without…

View →