Papers similar to 2605.31492 · 20 results

cs.AI, cs.LG · Recent · May 27, 2026

Tree of Thoughts as a Classical Heuristic Search Problem: Formal Foundations and Design Patterns

This paper unifies the fragmented field of Tree-of-Thoughts (ToT) reasoning by mapping LLM-based search processes onto a formal taxonomy derived from classical heuristic search theory.

cs.AI · Recent · May 28, 2026

LLM-Evolved Domain-Independent Heuristics for Symbolic AI Planning

Elliot Gestrin, Jendrik Seipp

This paper introduces the first LLM-generated, domain-independent heuristics for symbolic AI planning, using evolutionary search to surpass the performance of hand-engineered state-of-the-art methods.

cs.CL, cs.AI, cs.LG · Recent · May 29, 2026

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Nianyi Lin, Jiajie Zhang, Lei Hou, Juanzi Li

LongTraceRL addresses long-context reasoning challenges by generating highly challenging training data and introducing a fine-grained rubric reward, significantly improving evidence-grounded reasoning…

cs.AI · Recent · May 27, 2026

HRBench: Benchmarking and Understanding Thinking-Mode Switch Strategies in Hybrid-Reasoning LLMs

Yansong Ning, Mianpeng Liu, Jingwen Ye, Weidong Zhang +1 more

The paper introduces HRBench, a unified and comprehensive evaluation framework for systematically benchmarking and comparing various thinking-mode switching strategies in hybrid-reasoning LLMs.

cs.AI · Recent · Jun 1, 2026

LLM-Evolved Pattern Generators for Optimal Classical Planning

Windy Phung, Dominik Drexler, Arnaud Lequen, Jendrik Seipp

The paper introduces a novel LLM-driven evolutionary framework to synthesize admissible, domain-specific pattern generators, enabling optimal classical planning with high performance and interpretabil…

cs.AI, cs.CL, cs.LO · Recent · May 27, 2026

Satisfiability Solving with LLMs: A Matched-Pair Evaluation of Reasoning Capability

Leizhen Zhang, Shuhan Chen, Sheng Chen

The paper evaluates LLM reasoning on Boolean satisfiability (SAT) problems, concluding that conventional metrics are misleading and proposing a paired-formula protocol with Accurate Differentiation Ra…

cs.AI, cs.CL · Recent · May 27, 2026

Revealing Algorithmic Deductive Circuits for Logical Reasoning

Phuong Minh Nguyen, Tien Huu Dang, Naoya Inoue

This paper localizes the attention heads within LLMs responsible for specific reasoning steps, finding that specialized heads handle factual retrieval while higher layers manage global information int…

cs.AI, cs.CR, cs.SE · Recent · Mar 19, 2026

Implicit Patterns in LLM-Based Binary Analysis

Qiang Li, XiangRui Zhang, Haining Wang

This paper analyzes large-scale reasoning traces from LLM-based binary vulnerability analysis, identifying four structured, token-level implicit patterns that govern how LLMs explore code paths.

cs.AI, cs.LO · Recent · May 28, 2026

Reliable Reasoning with Large Language Models via Preference-Based Maximum Satisfiability

Pedro Orvalho, Marta Kwiatkowska, Guillem Alenyà, Felip Manyà

The paper proposes a hybrid reasoning framework where Large Language Models (LLMs) generate code to encode complex optimization problems into a preference-based Maximum Satisfiability (MaxSAT) format,…

cs.AI · Recent · May 27, 2026

Plan Before Search: Search Agents Need Plan

Zhipeng Qian, Zihan Liang, Yufei Ma, Ben Chen +6 more

The paper introduces Plan, a structured agentic behavior that decomposes multi-hop questions into ordered sub-questions before retrieval, and proposes a self-bootstrapping paradigm to train it without…

cs.AI · Recent · May 27, 2026

Do LLMs Build World Models From Text? A Multilingual Diagnostic of Spatial Reasoning

Zhikai Pan, Chih-Ting Liao, Chunrui Liu, Xi Xiao +4 more

The paper introduces a multilingual benchmark (MentalMap) to test if LLMs build internal spatial world models from text, finding a universal 'L3 reasoning cliff' suggesting that text-only working memo…

cs.CL, cs.AI, cs.LG · Recent · May 28, 2026

Does The Way You Plan Matter? An Empirical Study of Planning Representations for LLM Web Agents

Alejandra Zambrano, Sara Vera Marjanovic, Imene Kerboua, Xing Han Lù +1 more

This paper empirically demonstrates that the choice of plan representation (e.g., checklist vs. narrative) significantly impacts the robustness and success rate of LLM-based web agents.

cs.CL · Recent · May 29, 2026

ExpGraph: Model-Agnostic Experience Learning with Graph-Structured Memory for LLM Agents

Tao Feng, Chongrui Ye, Tianyang Luo, Jingjun Xu +7 more

ExpGraph is a model-agnostic framework that uses a self-evolving experience graph to enable LLM agents to reuse past successful strategies and failure lessons, significantly improving performance acro…

cs.CL, cs.IR · Recent · Jun 3, 2026

Caliper: Probing Lexical Anchors versus Causal Structure in LLMs

Zhenyu Yu, Shuigeng Zhou

This paper evaluates the causal reasoning abilities of large language models and finds that they rely heavily on lexical pattern matching rather than structural reasoning.

cs.CL, cs.AI, cs.IR · Recent · May 28, 2026

GrepSeek: Training Search Agents for Direct Corpus Interaction

Alireza Salemi, Chang Zeng, Atharva Nijasure, Jui-Hui Chung +3 more

GrepSeek introduces a novel direct corpus interaction (DCI) search agent that trains an LLM to find and compose evidence from large text corpora by issuing executable shell commands, achieving state-o…

cs.CL · Recent · May 31, 2026

On the Generalization Gap in Self-Evolving Language Model Reasoning

Zhenting Qi, Susanna Maria Baby, Stefanie Anna Baby, Kan Yuan +4 more

The paper investigates the limits of self-evolution in LLM reasoning under closed-loop settings, finding that while self-improvement is significant, it consistently falls short of perfect oracle super…

cs.CL, cs.AI · Recent · May 28, 2026

Unlocking the Working Memory of Large Language Models for Latent Reasoning

Lukas Aichberger, Sepp Hochreiter

The paper introduces Reasoning in Memory (RiM), a latent reasoning method that replaces autoregressive token generation with fixed memory blocks to enable compute-efficient internal working memory for…

cs.AI · Recent · May 27, 2026

Deconstructing Spatial Complexity: Hierarchical Decomposition for LLM Spatial Reasoning

Yi Wang, Haojie Lu, Zhaofan Zhang, Li Chen +1 more

This paper introduces MCTS-Guided Group Relative Policy Optimization (M-GRPO) to enhance LLM spatial reasoning by improving the decomposition of complex tasks into optimal sub-tasks.

cs.AI · Recent · May 28, 2026

TRACE: Toulmin-based Reasoning Assessment through Constructive Elements for LLM CoT Evaluation

Yundong Kim, Heyoung Yang

The paper introduces TRACE, a novel metric that evaluates the logical structure of LLM reasoning (CoT) by integrating Toulmin's argumentation theory, demonstrating that sound reasoning structure corre…

cs.CL, cs.AI · Recent · Jun 1, 2026

Learning When to Translate for Multilingual Reasoning

Deokhyung Kang, Hyounghun Kim, Gary Geunbae Lee

The paper proposes Luar, a framework that trains reasoning language models to selectively use English translation only when their direct understanding of a non-English input is unreliable, significant…
