Papers similar to 2605.28144

~ similar to 2605.28144· 18 results

cs.CVcs.AIcs.CLRecentMay 29, 2026

SpatialAct: Probing Spatial Reasoning-to-Action Capabilities of VLM Agents in 3D Scenes

Tianhui Liu, Jie Feng, Zhiheng Zheng, Shengyuan Wang +5 more

The paper introduces SpatialAct, a challenging benchmark that reveals a significant 'reasoning-to-action gap,' showing that current VLMs struggle to maintain coherent spatial understanding and perform…

View →

cs.AIRecentMay 27, 2026

Do LLMs Build World Models From Text? A Multilingual Diagnostic of Spatial Reasoning

Zhikai Pan, Chih-Ting Liao, Chunrui Liu, Xi Xiao +4 more

The paper introduces a multilingual benchmark (MentalMap) to test if LLMs build internal spatial world models from text, finding a universal 'L3 reasoning cliff' suggesting that text-only working memo…

View →

cs.CLcs.AIRecentMay 29, 2026

The Sword, Shield, and Achilles' Heel: Characterizing the Linguistic Inductive Bias of Large Language Models for Spatial Reasoning in Navigation Planning

Xudong Zhang, Jian Yang, Shengkai Wang, Jiangpeng Tian +4 more

The paper proposes a dual-interventional framework to characterize how linguistic structures and contextual cues influence LLMs' spatial reasoning for navigation, finding that topological information…

View →

cs.CVcs.AIEmpiricalRecentJun 11, 2026

SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning

Seokju Cho, Ryo Hachiuma, Abhishek Badki, Hang Su +7 more

This paper proposes SpatialClaw, a training-free framework for spatial reasoning that enables open-ended, complex 3D/4D spatial reasoning.

View →

cs.AIRecentMay 27, 2026

Do Agents Think Deeper? A Mechanistic Investigation of Layer-Wise Dynamics in Sequential Planning

Zhenyu Cui, Xiangzhong Luo

The paper investigates how LLMs allocate their internal computational depth during multi-turn agentic planning, finding that agents progressively recruit deeper layers and shift toward corrective upda…

View →

cs.CVRecentJun 1, 2026

Active Exploring like a Pigeon: Reinforcing Spatial Reasoning via Agentic Vision-Language Models

Wei Deng, Xianlin Zhang, Mengshi Qi

The paper proposes an agentic pipeline for spatial reasoning by introducing a dynamic cognitive map and Spatial Assertion Codes (SAC), achieving state-of-the-art performance on complex spatial tasks.

View →

cs.AIRecentMay 29, 2026

Planner-Centric Reinforcement Learning for Deep Research with Structure-Aware Reward

Mustafa Anis Hussain, Xinle Wu, Yao Lu

The paper proposes DecomposeR, a planner-centric framework that structures deep research into typed Directed Acyclic Graphs (DAGs) to explicitly improve the planning and execution of large language mo…

View →

cs.AIRecentMay 28, 2026

Meta-Cognitive Memory Policy Optimization for Long-Horizon LLM Agents

Ziyan Liu, Zhezheng Hao, Yeqiu Chen, Hong Wang +6 more

The paper introduces Metacognitive Memory Policy Optimization (MMPO), a novel memory training approach that optimizes LLM memory not based on final task success, but on minimizing epistemic uncertaint…

View →

cs.CVcs.AIcs.CLRecentMay 31, 2026

Distilling Neuro-Symbolic Programs into 3D Multi-modal LLMs

Wentao Mo, Yang Liu

The paper introduces APEIRIA, a neuro-symbolic 3D Multi-modal LLM that bridges the gap between interpretable symbolic reasoning and flexible, open-vocabulary 3D understanding.

View →

cs.CLRecentMay 31, 2026

Thinking Economically: A Hierarchical Framework for Adaptive-Complexity Reasoning in LLMs

Yubo Gao, Haotian Wu, Hong Chen, Junquan Huang +7 more

The paper introduces Hierarchical Adaptive Budgeter (HAB), a framework that improves LLM reasoning efficiency by adaptively allocating computational resources to match the intrinsic complexity of both…

View →

cs.AIRecentMay 28, 2026

MINDGAMES: A Live Arena for Evaluating Social and Strategic Reasoning in Multi-Agent LLMs

Kevin Wang, Anna Thöni, Benjamin Kempinski, Bobby Cheng +49 more

The paper introduces Mindgames, a comprehensive multi-game arena for evaluating LLM agents' sustained social and strategic reasoning, demonstrating that current evaluations are limited by structural s…

View →

cs.AIcs.CLRecentMay 28, 2026

Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents

Anany Kotawala

The paper introduces a metric, the compositional residual eps*, to quantify how multi-component LLM agents violate basic probability axioms when combining local, coherent claims into a global predicti…

View →

cs.AIRecentMay 28, 2026

LLM-Evolved Domain-Independent Heuristics for Symbolic AI Planning

Elliot Gestrin, Jendrik Seipp

This paper introduces the first LLM-generated, domain-independent heuristics for symbolic AI planning, using evolutionary search to surpass the performance of hand-engineered state-of-the-art methods.

View →

cs.CLRecentJun 1, 2026

Geometric Latent Reasoning Induces Shorter Generations in LLMs

Shashi Kumar, Yacouba Kaloga, Petr Motlicek, Ina Kodrasi +1 more

The paper introduces Geometric Latent Reasoning (GLR), a method that models reasoning as continuous paths in the embedding space, showing that this continuous approach allows LLMs to solve problems us…

View →

cs.AIRecentJun 1, 2026

Token Predictors Are Not Planners: Building Physically Grounded Causal Reasoners

Zheng Lu, Mingqi Gao, Qinlei Xie, Wanqi Zhong +7 more

The paper argues that current embodied planning benchmarks prioritize superficial language prediction over true physical reasoning, introducing new benchmarks and a large-scale dataset to demonstrate…

View →

cs.AIRecentMay 29, 2026

LinTree: Improving LLM Reasoning with Explicitly Structured Search Histories

Liwei Kang, Yee Whye Teh, Wee Sun Lee

The paper introduces LinTree, a method that explicitly structures the search history of LLM reasoning traces using parent pointers, significantly improving task performance and search efficiency compa…

View →

cs.CLcs.AIRecentJun 1, 2026

PlanarBench: Evaluating LLM Spatial Reasoning via Planar Graph Drawing

Oleksandr Nikitin

PlanarBench introduces a novel benchmark to test LLM spatial reasoning by requiring them to draw planar graphs as ASCII art from an edge list, finding that edge count is a stronger difficulty predicto…

View →

cs.CVcs.CLRecentMay 31, 2026

Reasmory: 3D Reconstruction as Explicit Memory for VLMs Spatial Reasoning

Jixuan He, Xueting Li, Chieh Hubert Lin, Ming-Hsuan Yang

Reasmory introduces a structured programming framework that uses explicit 3D memory and a Domain-Specific Language (DSL) to reliably enhance Vision-Language Models' spatial reasoning capabilities, ach…

View →