Papers similar to 2605.29192

~ similar to 2605.29192· 20 results

cs.AIRecentMay 28, 2026

TRACE: Toulmin-based Reasoning Assessment through Constructive Elements for LLM CoT Evaluation

The paper introduces TRACE, a novel metric that evaluates the logical structure of LLM reasoning (CoT) by integrating Toulmin's argumentation theory, demonstrating that sound reasoning structure corre…

View →

cs.LGcs.AIRecentMay 31, 2026

ThinkSwitch: Context Distillation with LoRA and Weight Interpolation for Specific-Purpose Reasoning Tasks

Dhruv Saini, Rohan Pandey

ThinkSwitch introduces a low-compute co-training procedure that distills the reasoning benefit of large language models into weights, significantly improving performance on specific reasoning tasks.

View →

cs.CLcs.IRRecentJun 3, 2026

Caliper: Probing Lexical Anchors versus Causal Structure in LLMs

Zhenyu Yu, Shuigeng Zhou

This paper evaluates the causal reasoning abilities of large language models and finds that they rely heavily on lexical pattern matching rather than structural reasoning.

View →

cs.AIRecentMay 28, 2026

Diagnosing Harmful Continuation in Answer-Correct Long-CoT Training Traces

Chen He, Yuhao Wu, Lei Wang, Wenxuan Zhang +1 more

The paper identifies and demonstrates that post-conclusion continuation in answer-correct long-CoT traces is harmful during LLM fine-tuning, proposing a method to cut this continuation.

View →

cs.AIRecentMay 27, 2026

CORE: Contrastive Reflection Enables Rapid Improvements in Reasoning

Linas Nasvytis, Simon Jerome Han, Ben Prystawski, Satchel Grant +2 more

The paper introduces Contrastive Reflection (CORE), a novel non-parametric method that rapidly improves language model reasoning by distilling contrasts between successful and unsuccessful problem att…

View →

cs.AIcs.CLRecentMay 27, 2026

Revealing Algorithmic Deductive Circuits for Logical Reasoning

Phuong Minh Nguyen, Tien Huu Dang, Naoya Inoue

This paper localizes the attention heads within LLMs responsible for specific reasoning steps, finding that specialized heads handle factual retrieval while higher layers manage global information int…

View →

cs.AIcs.CRRecentMay 30, 2026

Hidden Thoughts Are Not Secret: Reasoning Trace Exposure in LLMs

Yu-An Lu, Ci-Yang Tsai, Yu-Lin Tsai, Raluca Ada Popa +1 more

The paper introduces Reasoning Exposure Prompting (REP), a method that demonstrates that even when LLMs hide their internal reasoning steps from users, useful reasoning supervision can still be elicit…

View →

cs.AIcs.CRRecentMay 30, 2026

Hidden Thoughts Are Not Secret: Reasoning Trace Exposure in LLMs

Yu-An Lu, Ci-Yang Tsai, Yu-Lin Tsai, Raluca Ada Popa +1 more

The paper introduces Reasoning Exposure Prompting (REP), a method that demonstrates that even when LLMs hide internal reasoning traces from users, useful reasoning supervision can still be elicited th…

View →

cs.AIRecentMay 27, 2026

HRBench: Benchmarking and Understanding Thinking-Mode Switch Strategies in Hybrid-Reasoning LLMs

Yansong Ning, Mianpeng Liu, Jingwen Ye, Weidong Zhang +1 more

The paper introduces HRBench, a unified and comprehensive evaluation framework for systematically benchmarking and comparing various thinking-mode switching strategies in hybrid-reasoning LLMs.

View →

cs.AIRecentMay 29, 2026

LinTree: Improving LLM Reasoning with Explicitly Structured Search Histories

Liwei Kang, Yee Whye Teh, Wee Sun Lee

The paper introduces LinTree, a method that explicitly structures the search history of LLM reasoning traces using parent pointers, significantly improving task performance and search efficiency compa…

View →

cs.AIEmpiricalRecentJun 9, 2026

ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models

Wenhao Liu, Hao Shi, Yunhe Li, Weizhi Fei +6 more

This paper proposes a training-free framework called ReasonAlloc to mitigate inference bottlenecks in large language models by recasting decoding-time key-value compression as a hierarchical budget al…

View →

cs.AIEmpiricalRecentJun 9, 2026

ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models

Wenhao Liu, Hao Shi, Yunhe Li, Weizhi Fei +6 more

View →

cs.AIcs.LGRecentMay 27, 2026

Zipping the Thought: When and How Compressed Reasoning Data Works in LLM Post-Training

Kohsei Matsutani, Gouki Minegishi, Takeshi Kojima, Yusuke Iwasawa +1 more

This paper investigates how different types of compressed reasoning data (Explicit, Composed, Implicit CoT) affect LLM performance during post-training, finding that the choice of compression and subs…

View →

cs.CLcs.AIRecentMay 28, 2026

Unlocking the Working Memory of Large Language Models for Latent Reasoning

Lukas Aichberger, Sepp Hochreiter

The paper introduces Reasoning in Memory (RiM), a latent reasoning method that replaces autoregressive token generation with fixed memory blocks to enable compute-efficient internal working memory for…

View →

cs.AIcs.CLcs.LGRecentMay 31, 2026

An Enigma of Artificial Reason: Investigating the Production-Evaluation Gap in Large Reasoning Models

Mingzhong Sun, Teresa Yeo, Armando Solar-Lezama, Tan Zhi-Xuan

This paper investigates the production-evaluation gap in Large Reasoning Models (LRMs), finding that while LRMs excel at generating solutions, they struggle significantly to evaluate flawed reasoning,…

View →

cs.CLcs.AIRecentMay 28, 2026

Teaching Language Models to Check Grounded Claim Factuality with Human Test-Taking Strategies

Yuxuan Ye, Raul Santos-Rodriguez, Edwin Simpson

The paper proposes a novel, efficient method for checking the factuality of claims generated by LLMs by framing it as a true/false reading comprehension task and incorporating explicit test-taking str…

View →

cs.CLcs.AIRecentMay 28, 2026

COFT: Counterfactual-Conformal Decoding for Fair Chain-of-Thought Reasoning in Large Language Models

Arya Fayyazi, Mehdi Kamal, Massoud Pedram

COFT is a training-free decoding method that significantly reduces societal biases in large language model chain-of-thought reasoning by applying token-level fairness control at decode time.

View →

cs.CLcs.AIRecentMay 27, 2026

Integrated and Cross-Architecture Interpretation of LLM Reasoning

Leonardo Matthew Yauw, Wei-Bin Kou, Yujiu Yang

The paper introduces an Integrated, cross-Architecture Reasoning (IAR) framework to provide a unified and robust method for interpreting the opaque reasoning processes within Large Language Models.

View →

cs.AIRecentMay 27, 2026

The Shape of Overthinking: Backtracking Bursts in Long Reasoning Traces

Navid Rezazadeh, Arash Gholami Davoodi

The paper analyzes backtracking dynamics in long reasoning traces to distinguish between useful self-correction and unproductive revision, finding that correct reasoning exhibits early, isolated repai…

View →

cs.CLcs.AIcs.LGRecentMay 29, 2026

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Nianyi Lin, Jiajie Zhang, Lei Hou, Juanzi Li

LongTraceRL addresses long-context reasoning challenges by generating highly challenging training data and introducing a fine-grained rubric reward, significantly improving evidence-grounded reasoning…

View →