Papers similar to 2605.29116

Similar to 2605.29116 · 20 results

cs.AI, cs.CL · Recent · May 28, 2026

Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents

The paper introduces a metric, the compositional residual eps*, to quantify how multi-component LLM agents violate basic probability axioms when combining local, coherent claims into a global predicti…

cs.AI · Recent · May 27, 2026

TRACER: Turn-level Regret Matching with Inner Reinforcement Credit for Cooperative Multi-LLM Reasoning

Chusen Li, Zhou Liu, Shuigeng Zhou, Wentao Zhang

TRACER introduces a novel turn-level reinforcement framework that enables cooperative multi-LLM reasoning by separating decision-making into a regret-matching controller and a generation-credit layer.

cs.MA, cs.AI · Recent · May 29, 2026

Design and Evaluation of Multi-Agent AI Oracle Systems for Prediction Market Resolution

Tarun Kota

The paper evaluates multi-agent LLM oracle systems for prediction market resolution, finding that independent aggregation with confidence-weighted voting significantly outperforms single-model baselin…

cs.AI · Recent · May 28, 2026

TRACE: Toulmin-based Reasoning Assessment through Constructive Elements for LLM CoT Evaluation

Yundong Kim, Heyoung Yang

The paper introduces TRACE, a novel metric that evaluates the logical structure of LLM reasoning (CoT) by integrating Toulmin's argumentation theory, demonstrating that sound reasoning structure corre…

cs.MA, cs.AI, cs.CL · Recent · May 28, 2026

Social Reasoning in Machines: Investigating Collective Truth-Seeking Dynamics in Large Language Model Debate

Tom Pecher

This paper simulates the Argumentative Theory of Reasoning (ATR) using multi-agent debate among LLMs, demonstrating that collective adversarial discourse significantly enhances truth-seeking performan…

cs.CL, cs.SE · Recent · May 29, 2026

Combinatorial Synthesis: Scaling Code RLVR via Atomic Decomposition and Recombination

Jiasheng Zheng, Boxi Cao, Boxi Yu, Yuzhong Zhang +5 more

The paper introduces Atomic Decomposition and Recombination (ADR), a framework that generates genuinely novel and challenging verifiable code tasks, significantly improving the scalability of Re…

cs.SE, cs.AI · Recent · May 28, 2026

Agora: Toward Autonomous Bug Detection in Production-Level Consensus Protocols with LLM Agents

Xiang Liu, Sa Song, Zhaowei Zhang, Huiying Lan +5 more

The paper introduces Agora, a domain-aware multi-agent framework that successfully detects deep, previously unknown logic bugs in complex consensus protocols, outperforming existing LLM-based analysis…

cs.LG, cs.AR · Recent · Jun 2, 2026

MOSAIC: Efficient Mixture-of-Agent Scheduling via Adaptive Aggregation and Inference Concurrency

Saptarshi Mitra, Yifan Zhang, Rachid Karami, Phyo Pyae Moe Aung +4 more

MOSAIC is a novel scheduling framework that significantly accelerates Mixture-of-Agents (MoA) workloads by jointly optimizing expert placement and utilizing confidence-aware adaptive aggregation.

cs.AI, cs.CL, cs.CR · Recent · Apr 18, 2026

The Cognitive Penalty: Ablating System 1 and System 2 Reasoning in Edge-Native SLMs for Decentralized Consensus

Syed Muhammad Aqdas Rizvi

The paper demonstrates that for edge-native SLMs used in decentralized governance, simpler, intuitive reasoning (System 1) is significantly more robust and efficient than complex, iterative deliberati…

cs.CL · Recent · May 28, 2026

Counterfactual Graph for Multi-Agent LLM Calibration

Jiatan Huang, Mingchen Li, Ziming Li, Sunjae Kwon +2 more

The paper proposes CAGE-CAL, a counterfactual graph calibration framework, to accurately assess the reliability and detect over-confidence in multi-agent LLM systems after agents communicate.

cs.LG, cs.AI, cs.CL · Recent · May 28, 2026

Reasoning with Sampling: Cutting at Decision Points

Felix Zhou, Anay Mehrotra, Quanquan C. Liu

The paper introduces Entropy-Cut Metropolis-Hastings, an efficient sampling method that uses next-token entropy to identify and resample from critical decision points in a reasoning trace, significant…

cs.MA, cs.AI, cs.GT · Recent · May 28, 2026

Evolutionary Dynamics of Cooperation in Next-Generation LLM Agent Systems: A Cross-Provider Empirical Extension

Francisco León Zúñiga Bolívar

The study extends cooperative bias testing across diverse, next-generation LLMs, finding that provider identity is a stronger predictor of cooperative equilibrium than model generation, and that noise…

cs.AI · Recent · Jun 1, 2026

POIROT: Interrogating Agents for Failure Detection in Multi-Agent Systems

Iñaki Dellibarda Varela, R. Sendra-Arranz, Pablo Romero-Sorozabal, J. M. Valverde-García +4 more

The paper introduces POIROT, a novel protocol that uses a multi-agent system's own agents to diagnose and detect failures, demonstrating superior performance over traditional evaluation me…

cs.AI · Recent · May 30, 2026

FALAT: Tracing Failures in LLM Agent Trajectories via Dependency-Guided Search

Md Nakhla Rafi, Md Ahasanuzzaman, Dong Jae Kim, Zhijie Wang +1 more

FALAT is a diagnostic framework that treats failure attribution in complex LLM agent trajectories as a dependency-guided search problem, successfully identifying both the responsible agent and the dec…

cs.CL, cs.AI, cs.MA · Recent · Jun 3, 2026

Streaming Communication in Multi-Agent Reasoning

Zhen Yang, Xiaogang Xu, Wen Wang, Cong Chen +2 more

The paper introduces StreamMA, a streaming multi-agent reasoning system that significantly reduces latency and improves effectiveness by passing reasoning steps to downstream agents as they are genera…

cs.MA, cs.AI · Recent · May 28, 2026

Evolve as a Team: Collaborative Self-Evolution for LLM-based Multi-Agent Systems

Zhezheng Hao, Tianfu Wang, Huanshuo Dong, Ziyan Liu +6 more

The paper proposes Meta-Team, an experience-driven framework that enables multi-agent systems (MAS) to collaboratively self-evolve by transforming complex execution experiences into reusable improveme…

cs.SE, cs.AI, cs.MA · Recent · May 31, 2026

LLM Consortium for Software Design Refinement: A Controlled Experiment on Multi-Agent Collaboration Topologies

Nagarjuna Kanamarlapudi, Praveen K

The paper experimentally evaluates 12 multi-agent LLM collaboration topologies for software design, finding that structural adversarial prompting and cross-model review are the most effective approach…

cs.AI, cs.CL, cs.IR · Recent · May 31, 2026

Don't Ask the LLM to Track Freshness: A Deterministic Recipe for Memory Conflict Resolution

Vikas Reddy, Sumanth Challaram

The paper proposes a deterministic, version-aware aggregation method that significantly outperforms existing LLM-based systems for resolving memory conflicts in fact consolidation tasks.

cs.AI · Recent · May 27, 2026

Deconstructing Spatial Complexity: Hierarchical Decomposition for LLM Spatial Reasoning

Yi Wang, Haojie Lu, Zhaofan Zhang, Li Chen +1 more

This paper introduces MCTS-Guided Group Relative Policy Optimization (M-GRPO) to enhance LLM spatial reasoning by improving the decomposition of complex tasks into optimal sub-tasks.

cs.AI, cs.CL, cs.LO · Recent · May 27, 2026

Satisfiability Solving with LLMs: A Matched-Pair Evaluation of Reasoning Capability

Leizhen Zhang, Shuhan Chen, Sheng Chen

The paper evaluates LLM reasoning on Boolean satisfiability (SAT) problems, concluding that conventional metrics are misleading and proposing a paired-formula protocol with Accurate Differentiation Ra…
