Similar to 2605.29561 · 20 results
Wanying Ren, Xin Song, Futing Wang, Guoxiu He +1 more
The paper theoretically analyzes the limitations of parameter-based knowledge editing and empirically demonstrates that these methods consistently damage core LLM capabilities compared to retrieval-ba…
Tong Liu, Cheng Qian, Matej Cief, Yuan He +3 more
This paper analyzes tool-calling in LLM agents, demonstrating that evaluation results are highly sensitive to implementation details and proposing new techniques to significantly improve the efficienc…
Marko Kojic, Ivan Bondyrev, Aral de Moor, Joseph Shtok +5 more
Mellum 2 is an open-weight 12B Mixture-of-Experts (MoE) language model specialized for software engineering, achieving performance competitive with larger models while maintaining the efficiency of a…
The paper introduces MAVEN, a lightweight symbolic reasoning scaffold that significantly improves the generalization and end-to-end success rate of large language models in complex, multi-step tool-ca…
The paper introduces and evaluates five parameter alignment strategies that significantly mitigate catastrophic forgetting when continually pretraining multilingual expert language models across multi…
Kou Shi, Ziao Zhang, Shiting Huang, Avery Nie +6 more
The paper introduces AsyncTool, a new benchmark designed to evaluate LLM agents' ability to handle multiple, concurrent tasks with delayed tool feedback, demonstrating that asynchronous coordination i…
The paper introduces Contrastive Reflection (CORE), a novel non-parametric method that rapidly improves language model reasoning by distilling contrasts between successful and unsuccessful problem att…
LongAttnComp introduces a novel, two-stage fine-tuning framework for context compression that significantly improves long-context reasoning performance, matching or exceeding full-context accuracy on…
CoHyDE introduces an iterative co-training framework that jointly optimizes an LLM rewriter and a dense encoder, significantly improving tool retrieval accuracy for LLM agents, especially on vague que…
The paper proposes projectional decoding, a novel framework that integrates a partial graph model alongside text generation to ensure the semantic validity of LLM-generated software artifacts.
Yang He, Xiao Ding, Bibo Cai, Yufei Zhang +4 more
DeepTool introduces a novel Process-Supervised Reinforcement Learning framework to enhance Tool-Integrated Reasoning by explicitly supervising and rewarding intermediate, interleaved deliberation step…
Yifei Zuo, Dhruv Pai, Zhichen Zeng, Alec Dewulf +2 more
The paper introduces Parallax, a scalable and numerically stable parameterized Local Linear Attention mechanism that significantly improves LLM performance and efficiency compared to existing methods…
Tomer Keren, Nitay Calderon, Asaf Yehudai, Yotam Perlitz +2 more
The paper introduces TASTE, an automatic task synthesis method that generates challenging agent benchmarks by evolving tool sequences, demonstrating that existing benchmarks are saturated and that TAS…
Lu Yi, Runlin Lei, Liuyi Yao, Yuexiang Xie +5 more
The paper introduces Adaptive Context Management (AdaCoM), an external context manager that uses reinforcement learning to improve the performance of frozen LLM agents on long-horizon tasks by intelli…
CART introduces a parameter-efficient recurrent transformer architecture that reuses a core block multiple times, but its performance does not surpass a dense baseline, suggesting that weight sharing…
Jianxiang Yu, Jiapeng Zhu, Bochen Lin, Qier Cui +2 more
The paper introduces MASA, a model-aware skill alignment framework that adaptively rewrites general and task-specific skills for LLM agents, achieving superior performance across diverse backbones and…
Divya Tadimeti, Shawn Pan, Sameera Lanka, Chenghui Zhou +1 more
This paper demonstrates that targeted adaptation of the small language model Phi Silica, using dataset curation and fine-tuning, significantly improves its performance in short-form text rewriting, na…
The paper reframes Parameter-Efficient Fine-Tuning (PEFT) from a mere cost-saving alternative to a robust architecture for creating persistent, personalized models that layer specific behaviors onto l…
Garvin Guo, Donglei Yu, Yu Chen, Xiang Wang +5 more
The paper argues that observed gains in multimodal agents using tools may be due to learning tool-calling patterns rather than genuine capability expansion, finding that tool access provides little co…
Wenhang Shi, Yiren Chen, Shuqing Bian, Zhe Zhao +4 more
The paper introduces State-Adaptive Prompt Optimization (SAPO), a novel training strategy that treats prompts as dynamic variables to achieve robust fine-tuning, significantly mitigating catastrophic…