~ similar to 2605.30052· 20 results
Nizar Islah, Istabrak Abbes, Irina Rish, Sarath Chandar +1 more
This paper proposes a method to recover recoverability structure from failed traces of post-trained language models, enabling test-time routing and post-training analysis.
ACRFence introduces a framework-agnostic mitigation to prevent semantic rollback attacks in LLM agents by recording irreversible tool effects and enforcing strict replay-or-fork semantics upon checkpo…
The paper analyzes the failure modes of aggressive 2-bit quantization in large reasoning models, proposing lightweight controls like FP16 planning and loop rescue to restore accuracy and achieve pract…
Pramana introduces a standardized, protocol-level wire format for autonomous agent outputs, ensuring that every consequential claim is accompanied by a verifiable artifact that can be re-executed by a…
Ali Al-Kaswan, Maksim Plotnikov, Maxim Hájek, Roland Vízner +2 more
The paper introduces DeepRed, a new benchmark for evaluating LLM agents in realistic CTF challenges, finding that current agents are limited, achieving only 35% average checkpoint completion.
The paper introduces the Reconstructive Authority Model (RAM), a novel framework that proves execution validity by assessing state coverage rather than just state integrity, showing that existing atte…
The paper identifies and measures a critical failure mode where LLM agents violate policies by losing or corrupting directive-bearing state during the process of assembling the decision context, and p…
Xinle Deng, Ruobin Zhong, Hujin Peng, Xiaoben Lu +14 more
The paper introduces MemTrace, a framework that treats LLM memory pipelines as traceable graphs to systematically diagnose and automatically correct memory-related errors, boosting performance by up t…
The paper demonstrates that self-reflective agents can systematically confabulate incorrect memories, leading them to fail tasks even when the environment resets, and proposes a metric and mitigation…
Wenhan Chang, Tianqing Zhu, Ping Xiong, Faqian Guan +1 more
The paper proposes Two-stage Backdoor Hijacking (TSBH) to create persistent, trigger-activated malicious behaviors by manipulating the observable Chain-of-Thought (CoT) process in Large Language Model…
Xianyou Li, Weiran Yan, Yichao Wu, Penghao Liang +3 more
This paper introduces a failure-aware observability framework to diagnose wasted computation in multi-agent LLM systems by mapping recurring failure modes to online trace signals.
The paper proves that platform-deterministic inference is a necessary and sufficient condition for trustworthy AI, establishing that AI trust fundamentally relies on consistent arithmetic.
Kunhao Zheng, Pierre Chambon, Juliette Decugis, Jonas Gehring +3 more
The paper demonstrates that extrapolative weight averaging can effectively navigate and extend the correctness-efficiency frontier in code RL, leading to improved performance on complex programming ta…
The paper introduces Sovereign Agentic Loops (SAL), a control-plane architecture that decouples LLM reasoning from system execution to enhance safety and reliability in real-world AI agents.
The paper demonstrates that using Reinforcement Learning from Verifiable Rewards (RLVR) significantly improves small language models' functional correctness in code generation, particularly when combi…
Jiahao Huang, Fei Cheng, Junfeng Jiang, Zefan Yu +1 more
The paper introduces BenchTrace, a novel benchmark designed to rigorously evaluate the self-evolution and reflection capabilities of LLM agents, revealing that current models struggle with accurate fa…
The paper introduces CROP, a novel conformal procedure that provides rigorous statistical guarantees for certifying the longest safe prefix of a language model's reasoning trace, allowing for targeted…
The paper introduces Post-Deterministic Distributed Systems (PDDS) as a new model to coordinate autonomous infrastructure where participants, including stochastic agents, produce divergent reasoning p…
The paper introduces a self-healing agentic orchestrator that significantly improves the reliability of tool-augmented LLM systems by treating failure as a bounded runtime control problem, achieving h…
Sen Fang, Weiyuan Ding, Zhezhen Cao, Zhou Yang +1 more
AEGIS is a novel multi-agent framework that grounds vulnerability reasoning by reconstructing per-variable dependency chains over a Code Property Graph, achieving state-of-the-art performance on the P…