ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2605.30042· 20 results

cs.MAcs.AIRecentMay 28, 2026

Evolve as a Team: Collaborative Self-Evolution for LLM-based Multi-Agent Systems

Zhezheng Hao, Tianfu Wang, Huanshuo Dong, Ziyan Liu +6 more

The paper proposes Meta-Team, an experience-driven framework that enables multi-agent systems (MAS) to collaboratively self-evolve by transforming complex execution experiences into reusable improveme…

View →
cs.AIcs.LGRecentMay 30, 2026

MOSAIC: Modular Orchestration for Structured Agentic Intelligence and Composition

Yifan Bao, Xinyu Xi, Xinyu Liu, Wen Ge +7 more

MOSAIC introduces a structured agentic framework that treats automated data science as a staged, context-grounded model selection problem, improving performance and traceability over traditional AutoM…

View →
cs.AIRecentMay 29, 2026

AutoSci: A Memory-Centric Agentic System for the Full Scientific Research Lifecycle

Weitong Qian, Beicheng Xu, Zhongao Xie, Bowen Fan +15 more

AutoSci is a memory-centric agentic system designed to automate the entire scientific research lifecycle by integrating structured memory, multi-stage execution, and continuous self-improvement.

View →
cs.AIRecentMay 27, 2026

A Unified Framework for the Evaluation of LLM Agentic Capabilities

Pengyu Zhu, Lijun Li, Yaxing Lyu, Qianxin Luo +7 more

The paper introduces a unified framework to fairly evaluate LLM agentic capabilities by standardizing diverse benchmarks and separating the effects of the LLM model from the surrounding framework and…

View →
cs.AIRecentJun 1, 2026

POIROT: Interrogating Agents for Failure Detection in Multi-Agent Systems

Iñaki Dellibarda Varela, R. Sendra-Arranz, Pablo Romero-Sorozabal, J. M. Valverde-García +4 more

The paper introduces POIROT, a novel protocol that uses the agents within a multi-agent system itself to diagnose and detect failures, demonstrating superior performance over traditional evaluation me…

View →
cs.CLRecentJun 1, 2026

HarnessForge: Joint Harness and Policy Evolution for Adaptive Agent Systems

Mingju Chen, Can Lv, Guibin Zhang, Heng Chang +1 more

HarnessForge introduces a meta-adaptive framework that jointly evolves the execution structure (harness) and the reasoning policy of LLM agents, significantly improving overall system performance acro…

View →
cs.AIcs.CLRecentJun 4, 2026

MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

Shangheng Du, Xiangchao Yan, Jinxin Shi, Zongsheng Cao +10 more

MLEvolve is a novel self-evolving multi-agent framework that enables LLM agents to discover and optimize machine learning algorithms for complex, long-horizon tasks.

View →
cs.AIcs.CLEmpiricalRecentJun 11, 2026

EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery

Amy Xin, Jiening Siow, Junjie Wang, Zijun Yao +4 more

This paper presents EurekAgent, an environment-engineered agent system for metric-driven autonomous scientific discovery.

View →
cs.LGcs.AIRecentMay 29, 2026

Learning to Construct Practical Agentic Systems

Aditya Kumar, Zhihan Lei, Jerry Yan, Joshua W. Momo +5 more

The paper proposes a modular agent framework and novel learning methods to design and optimize practical, cost-effective, and controllable LLM-based agentic systems.

View →
cs.IRcs.AIRecentMay 27, 2026

Do Agents Need Semantic Metadata? A Comparative Study in Agentic Data Retrieval

Shiyu Chen, Tarfah Alrashed, Alon Halevy, Natasha Noy

The study compares agentic data retrieval using unstructured web data versus structured, semantically-annotated datasets, concluding that semantic metadata remains essential for high-precision, reliab…

View →
cs.CLcs.AIRecentMay 30, 2026

Skill or Skip? Learning Selective Skill Invocation in Agentic Tasks via Dual-Granularity Preference Learning

Chishui Chen, Jiaye Lin, Te Sun, Junxi Wang +5 more

SelSkill introduces a dual-granularity preference learning framework that treats skill use as a 'skill-or-skip' decision, significantly improving agent performance and execution precision in complex a…

View →
cs.MAcs.AIcs.CLRecentMay 30, 2026

Dynamic Coordination Strategy Selection for Enterprise Multi-Agent Systems

Thanh Luong Tuan

The paper evaluates dynamic coordination strategy selection for enterprise multi-agent systems, finding that a calibrated default routing approach is effective, even if a deterministic winner-selectio…

View →
cs.SEcs.AIRecentJun 3, 2026

From Prompt to Process: a Process Taxonomy and Comparative Assessment of Frameworks Supporting AI Software Development Agents

Sanderson Oliveira de Macedo

This paper studies AI development frameworks for software engineering and proposes a six-dimension process taxonomy.

View →
cs.LGcs.AIEmpiricalRecentJun 10, 2026

APPO: Agentic Procedural Policy Optimization

Xucong Wang, Ziyu Ma, Yong Wang, Yuxiang Ji +4 more

This paper proposes a new method for agentic Reinforcement Learning called Agentic Procedural Policy Optimization (APPO) that improves tool-use capabilities by assigning credit to fine-grained decisio…

View →
cs.AIRecentJun 1, 2026

Learning When Not to Act: Mitigating Tool Abuse in Agentic Reinforcement Learning

Liuji Chen, Dianxing Tang, Xing Shi, Dingshuo Chen +3 more

The paper proposes EAPO, a framework that enables agentic models to learn when to forgo using external tools, thereby mitigating tool abuse while maintaining high reasoning accuracy.

View →
cs.AIRecentMay 31, 2026

SkillSmith: Co-Evolving Skills and Tools for Self-Improving Agent Systems

Yangbo Wei, Zhen Huang, Shaoqiang Lu, Junhong Qian +3 more

SkillSmith is a synergy-aware framework that jointly co-evolves skills and tools, significantly improving self-improving agent systems by modeling skill-tool interactions and diagnosing failures.

View →
cs.CLcs.AIcs.IRRecentMay 28, 2026

Exploring Autonomous Agentic Data Engineering for Model Specialization

Yujie Luo, Xiangyuan Ru, Jingsheng Zheng, Jingjing Wang +9 more

The paper introduces Autonomous Agentic Data Engineering, demonstrating that LLMs can autonomously plan and optimize end-to-end data curation pipelines, leading to substantial performance gains in spe…

View →
cs.AIcs.CReess.SYRecentMay 4, 2026

Stable Agentic Control: Tool-Mediated LLM Architecture for Autonomous Cyber Defense

Kerri Prinos, Lilianne Brush, Cameron Denton, Zhanqi Wang +4 more

The paper proposes a tool-mediated LLM architecture for autonomous cyber defense, formally proving its stability and demonstrating that it significantly reduces an attacker's expected payoff in real-w…

View →
cs.AIRecentMay 31, 2026

"Skill issues'': data-centric optimization of lakehouse agents

Nicole Rose Schneider, Davide Ghilardi, Giacomo Piccinini, Jacopo Tagliabue

The paper introduces a data-centric optimization pipeline to improve coding agents' ability to interact with a branching lakehouse, showing significant accuracy gains by treating agent evaluation as a…

View →
cs.AIRecentMay 30, 2026

FALAT: Tracing Failures in LLM Agent Trajectories via Dependency-Guided Search

Md Nakhla Rafi, Md Ahasanuzzaman, Dong Jae Kim, Zhijie Wang +1 more

FALAT is a diagnostic framework that treats failure attribution in complex LLM agent trajectories as a dependency-guided search problem, successfully identifying both the responsible agent and the dec…

View →