Papers similar to 2605.28678

~ similar to 2605.28678· 19 results

cs.AIRecentMay 27, 2026

HRBench: Benchmarking and Understanding Thinking-Mode Switch Strategies in Hybrid-Reasoning LLMs

Yansong Ning, Mianpeng Liu, Jingwen Ye, Weidong Zhang +1 more

The paper introduces HRBench, a unified and comprehensive evaluation framework for systematically benchmarking and comparing various thinking-mode switching strategies in hybrid-reasoning LLMs.

View →

cs.AIRecentJun 1, 2026

eMoT: evolving Memory-of-Thought via Symbolic Anchoring and Memory Corrosion

Xiang Li, Jiwei Wei, Ke Liu, Yitong Qin +4 more

The eMoT framework enhances multi-step reasoning in LLMs by treating reasoning as an evolving memory, stabilizing performance through symbolic computation and structured refinement.

View →

cs.AIcs.LGRecentJun 1, 2026

Extreme Low-Bit Inference in Reasoning Models: Failure Modes and Targeted Recovery

Ekaterina Alimaskina, Darya Rudas, Denis Shveykin, Gleb Molodtsov +2 more

The paper analyzes the failure modes of aggressive 2-bit quantization in large reasoning models, proposing lightweight controls like FP16 planning and loop rescue to restore accuracy and achieve pract…

View →

cs.AIRecentJun 1, 2026

TRON: Targeted Rule-Verifiable Online Environments for Visual Reasoning RL

Tianze Yang, Yucheng Shi, Ruitong Sun, Jingyuan Huang +2 more

The paper introduces TRON, an online, rule-verifiable environment substrate that generates an unbounded stream of fresh, controllable visual reasoning training instances, significantly improving RL pe…

View →

cs.CLcs.AIRecentMay 31, 2026

Hybrid Verified Decoding: Learning to Allocate Verification in Speculative Decoding

Xin Su, Dawid Majchrowski, Fangyuan Yu, Vanshil Atul Shah +4 more

The paper introduces Hybrid Verified Decoding, a method that predicts the acceptance length of a cache draft to intelligently select between cache verification and model-based drafting, achieving sign…

View →

cs.CLcs.AIRecentMay 28, 2026

Unlocking the Working Memory of Large Language Models for Latent Reasoning

Lukas Aichberger, Sepp Hochreiter

The paper introduces Reasoning in Memory (RiM), a latent reasoning method that replaces autoregressive token generation with fixed memory blocks to enable compute-efficient internal working memory for…

View →

cs.AIcs.LGRecentMay 27, 2026

Zipping the Thought: When and How Compressed Reasoning Data Works in LLM Post-Training

Kohsei Matsutani, Gouki Minegishi, Takeshi Kojima, Yusuke Iwasawa +1 more

This paper investigates how different types of compressed reasoning data (Explicit, Composed, Implicit CoT) affect LLM performance during post-training, finding that the choice of compression and subs…

View →

cs.AIcs.CLcs.LGRecentMay 27, 2026

Where Rollouts Begin: Low-Load, High-Leverage First-Token Diversification for RLVR

Soeun Kim, Albert No

The paper introduces REFT, a novel method that diversifies rollouts by sampling the first token after the reasoning marker, significantly improving performance in Reinforcement Learning with Verifiabl…

View →

cs.CLcs.AIRecentJun 1, 2026

A Primer in Post-Training Reasoning Data: What We Know About How It Works

Yaoming Li, Guangxiang Zhao, Qilong Shi, Lin Sun +2 more

This paper synthesizes over 150 scattered studies and reports to provide the first comprehensive primer on post-training reasoning data, organizing the field around data objects, utility, construction…

View →

cs.AIRecentMay 30, 2026

ForeSci: Evaluating LLM Agents for Forward-Looking AI Research Judgment

Qiuyu Tian, Zequn Liu, Yingce Xia, Haojie Yin +1 more

The paper introduces ForeSci, a novel benchmark that evaluates LLM agents' ability to make forward-looking research judgments using only historical evidence, finding that explicit evidence organizatio…

View →

cs.AIcs.CLcs.LGRecentMay 28, 2026

When Should Models Change Their Minds? Contextual Belief Management in Large Language Models

Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang +5 more

The paper introduces Contextual Belief Management (CBM) to address how LLMs should manage accumulating information over long interactions, showing that reinforcement learning significantly improves be…

View →

cs.AIcs.MARecentMay 29, 2026

MindZero: Learning Online Mental Reasoning With Zero Annotations

Shunchi Zhang, Jin Lu, Chuanyang Jin, Yichao Zhou +2 more

MindZero introduces a self-supervised reinforcement learning framework that trains multimodal large language models (MLLMs) for efficient and robust online mental reasoning without requiring explicit…

View →

cs.AIRecentMay 27, 2026

Look on Demand: A Cognitive Scheduling Framework for Visual Evidence Acquisition in Multimodal Reasoning

Yang Zhang, Xiaoshuai Sun, Rui Zhao, Wujin Sun +4 more

The paper proposes CSMR, a cognitive scheduling framework that allows a language model to dynamically decide when to acquire task-relevant visual evidence, significantly improving multimodal reasoning…

View →

cs.AIcs.LGRecentMay 27, 2026

Tree of Thoughts as a Classical Heuristic Search Problem: Formal Foundations and Design Patterns

Guni Sharon

This paper unifies the fragmented field of Tree-of-Thoughts (ToT) reasoning by mapping LLM-based search processes onto a formal taxonomy derived from classical heuristic search theory.

View →

cs.CVcs.AIcs.CLRecentMay 29, 2026

SpatialAct: Probing Spatial Reasoning-to-Action Capabilities of VLM Agents in 3D Scenes

Tianhui Liu, Jie Feng, Zhiheng Zheng, Shengyuan Wang +5 more

The paper introduces SpatialAct, a challenging benchmark that reveals a significant 'reasoning-to-action gap,' showing that current VLMs struggle to maintain coherent spatial understanding and perform…

View →

cs.CLRecentMay 29, 2026

Unlocking Fine-Grained Translation Quality Estimation in LRMs through Synergistically Evolving Implicit and Explicit Reasoning

Renfei Dang, Xinye Wang, Zhejian Lai, Weilu Xu +4 more

The paper proposes RIEQE, a two-stage training framework that synergistically co-evolves implicit and explicit reasoning capabilities in Large Reasoning Models (LRMs) to significantly improve fine-gra…

View →

cs.AIRecentMay 27, 2026

Towards Faithful Agentic XAI: A Verification Method and an Open-World Benchmark for Better Model Faithfulness

Jaechang Kim, Sunung Mun, Seungjoon Lee, Jaewoong Cho +1 more

The paper proposes Faithful Agentic XAI (FAX), a verification framework that explicitly checks LLM-generated explanations against model behavior, significantly improving explanation faithfulness on a…

View →

cs.LGcs.AIRecentMay 31, 2026

RLVR without Ineffective Samples: Group Prioritized Off-Policy Optimization for LLM Reasoning

Yixiu Mao, Yun Qu, Qi Wang, Heming Zou +1 more

The paper introduces Group Prioritized Off-Policy Optimization (POPO), a novel framework that efficiently accelerates RL finetuning for LLM reasoning by leveraging effective off-policy training batche…

View →

cs.CLcs.LGEmpiricalRecentJun 4, 2026

Latent Reasoning with Normalizing Flows

Guancheng Tu, Xiangjun Fu, Suhao Yu, Yao Tang +4 more

This paper proposes NF-CoT, a latent reasoning framework that preserves the advantages of chain-of-thought in large language models.

View →