similar to 2606.00278 · 20 results
This paper systematically evaluates the consistency of popular causal discovery benchmarks against real-world scientific literature, revealing significant variability in their accuracy.
This paper introduces an entropy-based method to generate multiple plausible causal maps (atlases) that accurately reflect the inherent structural ambiguity in complex systems, moving beyond single, o…
Zizhen Deng, Jiaru Zhang, Rui Ding, Huang Bojun +4 more
The paper proposes Test-Time Training for Supervised Causal Learning (TTT-SCL), a novel framework that dynamically generates training data aligned with specific test instances to significantly improve…
The paper formalizes the concept of a causal pathway for rare events, showing that testable implications can be derived solely from this pathway abstraction, simplifying complex causal modeling.
This paper evaluates the causal reasoning abilities of large language models and finds that they rely heavily on lexical pattern matching rather than structural reasoning.
Zheng Lu, Mingqi Gao, Qinlei Xie, Wanqi Zhong +7 more
The paper argues that current embodied planning benchmarks prioritize superficial language prediction over true physical reasoning, introducing new benchmarks and a large-scale dataset to demonstrate…
Giuliano Martinelli, Piriyakorn Piriyatamwong, Abelardo Carlos Martinez Lorenzo, Jasmin Baier +6 more
The paper introduces Query2Effect, a large-scale benchmark, and a two-step framework to predict causal effect sizes from natural language queries, showing that structured representation significantly…
The paper introduces causal density functions, which are local density ratios that allow for the pointwise estimation and scoring of directed causal influence by comparing interventional and observati…
Zihan Chen, Yiming Zhang, Wenxiang Geng, Zenghui Ding +1 more
The paper theoretically explains that optimizing LLMs solely on outcomes leads to brittle reasoning (Reward-Induced Manifold Collapse) by favoring low-complexity shortcuts, and proposes process-based…
Shuaike Li, Kai Zhang, Xianquan Wang, Jiachen Liu +1 more
The paper introduces Causal Editing (CODE), a new paradigm that improves knowledge updates in LLMs by grounding fact injection in causal narratives, drastically reducing self-refutation rates.
The paper compares verbalized feature attributions and self-generated rationales for explaining model behavior, finding that the format and granularity of the explanation significantly affect its abil…
The paper challenges the conclusion that LLMs lack reasoning by demonstrating that reported performance drops on GSM-Symbolic are often statistically weak and partially attributable to dataset biases,…
Nizar Islah, Istabrak Abbes, Irina Rish, Sarath Chandar +1 more
This paper proposes a method to recover recoverability structure from failed traces of post-trained language models, enabling test-time routing and post-training analysis.
The paper introduces TRAILS, a novel method that improves code correctness validation by grounding LLM reasoning in concrete (input, output) pairs derived from specifications, achieving state-of-the-…
This survey provides a comprehensive analysis of Reasoning Language Model (RLM) adoption across 28 scientific disciplines, revealing significant disparities in RLM maturity across different scientific…
The paper proposes DA-GC, a certified causal attribution framework that accurately identifies cross-slice attack origins in 6G networks under strict real-time latency constraints by systematically mod…
Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang +5 more
The paper introduces Contextual Belief Management (CBM) to address how LLMs should manage accumulating information over long interactions, showing that reinforcement learning significantly improves be…
This paper investigates the production-evaluation gap in Large Reasoning Models (LRMs), finding that while LRMs excel at generating solutions, they struggle significantly to evaluate flawed reasoning,…
The paper proposes graph-coupled causal Bayesian optimization, a method that improves efficiency by sharing information across related interventions through a shared set of causal parameters.
This paper introduces topological-geometrical metrics to estimate structural causal effects that are missed by traditional mean-based methods, proposing a new concept called topological ignorability.