~ similar to 2606.01828· 20 results
The paper proposes Multi-Order Communication (MOC) to overcome the limitations of standard first-order message passing in LLM-based multi-agent systems, significantly improving performance by capturin…
Zhen Yang, Xiaogang Xu, Wen Wang, Cong Chen +2 more
The paper introduces StreamMA, a streaming multi-agent reasoning system that significantly reduces latency and improves effectiveness by passing reasoning steps to downstream agents as they are genera…
This paper simulates the Argumentative Theory of Reasoning (ATR) using multi-agent debate among LLMs, demonstrating that collective adversarial discourse significantly enhances truth-seeking performan…
The paper demonstrates that for edge-native SLMs used in decentralized governance, simpler, intuitive reasoning (System 1) is significantly more robust and efficient than complex, iterative deliberati…
Jiatan Huang, Mingchen Li, Ziming Li, Sunjae Kwon +2 more
The paper proposes CAGE-CAL, a counterfactual graph calibration framework, to accurately assess the reliability and detect over-confidence in multi-agent LLM systems after agents communicate.
Yaoyang Luo, Zhi Zheng, Ziwei Zhao, Tong Xu +4 more
This paper addresses the threat of coordinated misinformation in LLM-based Multi-Agent Systems by proposing a defense framework, STAR, that effectively identifies and rectifies misleading information…
This paper investigates if team-based interaction improves LLM performance on complex reasoning tasks (ChGK), finding that structured team strategies significantly boost accuracy by acting as error-fi…
The paper introduces POIROT, a novel protocol that uses the agents within a multi-agent system itself to diagnose and detect failures, demonstrating superior performance over traditional evaluation me…
Maharshi Gor, Yoo Yeon Sung, Yu Hou, Eve Fleisig +3 more
This study investigates human-AI collaboration in question answering, finding that while collaboration is beneficial, humans make suboptimal decisions by both under-relying on correct AI suggestions a…
This paper investigates the scaling behavior of homogeneous LLM-driven Multi-Agent Systems (MAS) and finds that performance exhibits diminishing returns due to coordination overhead, rather than scali…
This paper analyzes failure modes in collaborative visual reasoning systems, demonstrating that naive shared workspaces can amplify hallucinations and proposing diagnostics for improving communication…
Wenwu Li, Yuran Song, Mingze Zhao, Bo Jin +1 more
The paper proposes a novel temporal and structural credit assignment framework to efficiently optimize multi-agent LLM systems by decomposing the error signal and using targeted, discrete gradient upd…
The paper introduces a novel framework to quantify faithful confidence expression (FC) in Large Reasoning Models (LRMs), finding that FC remains a significant and challenging reliability target for th…
The paper evaluates dynamic coordination strategy selection for enterprise multi-agent systems, finding that a calibrated default routing approach is effective, even if a deterministic winner-selectio…
Kevin Wang, Anna Thöni, Benjamin Kempinski, Bobby Cheng +49 more
The paper introduces Mindgames, a comprehensive multi-game arena for evaluating LLM agents' sustained social and strategic reasoning, demonstrating that current evaluations are limited by structural s…
The paper introduces Post-Deterministic Distributed Systems (PDDS) as a new model to coordinate autonomous infrastructure where participants, including stochastic agents, produce divergent reasoning p…
Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang +5 more
The paper introduces Contextual Belief Management (CBM) to address how LLMs should manage accumulating information over long interactions, showing that reinforcement learning significantly improves be…
Jiling Zhou, Aisvarya Adeseye, Seppo Virtanen, Antti Hakkala +1 more
The paper proposes a structured prompt engineering framework to enhance the integrity and reliability of Chain-of-Thought (CoT) reasoning in LLMs, demonstrating significant improvements in security-se…
The paper evaluates multi-agent LLM oracle systems for prediction market resolution, finding that independent aggregation with confidence-weighted voting significantly outperforms single-model baselin…
The paper proposes using an LLM aggregator that analyzes complete reasoning traces, demonstrating that trace-level synthesis is superior to traditional consensus methods like majority voting for solvi…