Papers similar to 2605.28008

Similar to 2605.28008 · 20 results

cs.AI · Recent · May 27, 2026

Thinking as Compression: Your Reasoning Model is Secretly a Context Compressor

Guoxin Ma, Yibing Liu, Chengzhengxu Li, Yu Liang +6 more

The paper introduces Thinking as Compression (TaC), a novel paradigm showing that the inherent reasoning process of a large language model can naturally compress long context inputs, outperforming ded…

cs.CL · cs.LG · Empirical · Recent · Jun 4, 2026

Latent Reasoning with Normalizing Flows

Guancheng Tu, Xiangjun Fu, Suhao Yu, Yao Tang +4 more

This paper proposes NF-CoT, a latent reasoning framework that preserves the advantages of chain-of-thought in large language models.

cs.CL · Recent · May 31, 2026

Thinking Economically: A Hierarchical Framework for Adaptive-Complexity Reasoning in LLMs

Yubo Gao, Haotian Wu, Hong Chen, Junquan Huang +7 more

The paper introduces Hierarchical Adaptive Budgeter (HAB), a framework that improves LLM reasoning efficiency by adaptively allocating computational resources to match the intrinsic complexity of both…

cs.CL · Recent · May 29, 2026

Unlocking Fine-Grained Translation Quality Estimation in LRMs through Synergistically Evolving Implicit and Explicit Reasoning

Renfei Dang, Xinye Wang, Zhejian Lai, Weilu Xu +4 more

The paper proposes RIEQE, a two-stage training framework that synergistically co-evolves implicit and explicit reasoning capabilities in Large Reasoning Models (LRMs) to significantly improve fine-gra…

cs.CL · cs.AI · Recent · May 28, 2026

Unlocking the Working Memory of Large Language Models for Latent Reasoning

Lukas Aichberger, Sepp Hochreiter

The paper introduces Reasoning in Memory (RiM), a latent reasoning method that replaces autoregressive token generation with fixed memory blocks to enable compute-efficient internal working memory for…

cs.AI · cs.LG · Recent · Jun 1, 2026

Extreme Low-Bit Inference in Reasoning Models: Failure Modes and Targeted Recovery

Ekaterina Alimaskina, Darya Rudas, Denis Shveykin, Gleb Molodtsov +2 more

The paper analyzes the failure modes of aggressive 2-bit quantization in large reasoning models, proposing lightweight controls like FP16 planning and loop rescue to restore accuracy and achieve pract…

cs.LG · cs.AI · Recent · May 31, 2026

ThinkSwitch: Context Distillation with LoRA and Weight Interpolation for Specific-Purpose Reasoning Tasks

Dhruv Saini, Rohan Pandey

ThinkSwitch introduces a low-compute co-training procedure that distills the reasoning benefit of large language models into weights, significantly improving performance on specific reasoning tasks.

cs.CL · cs.AI · Recent · May 28, 2026

COFT: Counterfactual-Conformal Decoding for Fair Chain-of-Thought Reasoning in Large Language Models

Arya Fayyazi, Mehdi Kamal, Massoud Pedram

COFT is a training-free decoding method that significantly reduces societal biases in large language model chain-of-thought reasoning by applying token-level fairness control at decode time.

cs.AI · Empirical · Recent · Jun 9, 2026

ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models

Wenhao Liu, Hao Shi, Yunhe Li, Weizhi Fei +6 more

This paper proposes a training-free framework called ReasonAlloc to mitigate inference bottlenecks in large language models by recasting decoding-time key-value compression as a hierarchical budget al…

cs.AI · Recent · May 29, 2026

SLAT: Segment-Level Adaptive Trimming for Efficient CoT Reasoning

Jian Yao, Xiongcai Luo, Ran Cheng, Kay Chen Tan

The paper proposes SLAT, a segment-level adaptive trimming framework, which efficiently reduces redundant reasoning in large language model CoT outputs by selectively suppressing segments with low mar…

cs.AI · cs.CL · Recent · May 27, 2026

Revealing Algorithmic Deductive Circuits for Logical Reasoning

Phuong Minh Nguyen, Tien Huu Dang, Naoya Inoue

This paper localizes the attention heads within LLMs responsible for specific reasoning steps, finding that specialized heads handle factual retrieval while higher layers manage global information int…

cs.AI · cs.LG · Recent · May 27, 2026

Tree of Thoughts as a Classical Heuristic Search Problem: Formal Foundations and Design Patterns

Guni Sharon

This paper unifies the fragmented field of Tree-of-Thoughts (ToT) reasoning by mapping LLM-based search processes onto a formal taxonomy derived from classical heuristic search theory.

cs.CL · Recent · May 31, 2026

LongAttnComp: Cross-Family Context Compression for Long-Context Reasoning

Mengmeng Ji, Ravi Shanker Raju, Jonathan Lingjie Li, Chen Wu

LongAttnComp introduces a novel, two-stage fine-tuning framework for context compression that significantly improves long-context reasoning performance, matching or exceeding full-context accuracy on…

cs.AI · Recent · May 27, 2026

HRBench: Benchmarking and Understanding Thinking-Mode Switch Strategies in Hybrid-Reasoning LLMs

Yansong Ning, Mianpeng Liu, Jingwen Ye, Weidong Zhang +1 more

The paper introduces HRBench, a unified and comprehensive evaluation framework for systematically benchmarking and comparing various thinking-mode switching strategies in hybrid-reasoning LLMs.

cs.CR · Recent · Apr 28, 2026

R-CoT: A Reasoning-Layer Watermark via Redundant Chain-of-Thought in Large Language Models

Ziming Zhang, Li Li, Guorui Feng, Hanzhou Wu +1 more

The paper proposes R-CoT, a reasoning-layer watermarking framework that embeds ownership watermarks directly into the stable reasoning path of LLMs, achieving high robustness against perturbations.

cs.LG · cs.CL · Recent · Jun 1, 2026

HMPO: Hybrid Median-length Policy Optimization for Chain-of-Thought Compression

Minghui Zheng, Hongxu Chen, Huimin Ren, Hongsheng Xin +7 more

HMPO introduces a single-stage, cost-effective reinforcement learning framework that achieves significant token compression of Chain-of-Thought reasoning with minimal loss of accuracy, applicable acro…

cs.AI · Recent · May 28, 2026

Diagnosing Harmful Continuation in Answer-Correct Long-CoT Training Traces

Chen He, Yuhao Wu, Lei Wang, Wenxuan Zhang +1 more

The paper shows that post-conclusion continuation in answer-correct long-CoT traces is harmful during LLM fine-tuning, and proposes a method to cut this continuation.

cs.CL · cs.AI · Recent · Jun 1, 2026

A Primer in Post-Training Reasoning Data: What We Know About How It Works

Yaoming Li, Guangxiang Zhao, Qilong Shi, Lin Sun +2 more

This paper synthesizes over 150 scattered studies and reports to provide the first comprehensive primer on post-training reasoning data, organizing the field around data objects, utility, construction…
