~ similar to 2605.30747· 20 results
Liangyi Huang, Zichen Liu, Fei Shao, Shang Ma +4 more
The paper introduces GRID, an end-to-end framework that significantly improves the construction of security knowledge graphs from cyber threat intelligence by replacing unstable LLM-based supervision…
The paper introduces GraphARC, a new benchmark for abstract reasoning on graph-structured data, demonstrating that current state-of-the-art language models struggle with full graph transformation task…
BIRDNet is a novel, sparse, and interpretable deep neural network that encodes Boolean implication knowledge mined directly from tabular data, achieving performance comparable to dense models while dr…
The paper introduces a novel LLM-driven evolutionary framework to synthesize admissible, domain-specific pattern generators, enabling optimal classical planning with high performance and interpretabil…
Shenghao Ye, Yu Guo, Zhengheng Li, Shuangwu Chen +1 more
The paper proposes RoRo, a rubric-guided process reward framework that improves stepwise model routing by evaluating the quality of intermediate reasoning steps, leading to better performance and cost…
This paper introduces the first LLM-generated, domain-independent heuristics for symbolic AI planning, using evolutionary search to surpass the performance of hand-engineered state-of-the-art methods.
Zheng Yuan, Chuang Zhou, Linhao Luo, Siyu An +3 more
MoG proposes a novel Mixture of Experts framework for graph-based RAG, which uses hub graphs to guide the sparse activation of domain-specific expert graphs, significantly improving retrieval accuracy…
The paper introduces Entropy-Cut Metropolis-Hastings, an efficient sampling method that uses next-token entropy to identify and resample from critical decision points in a reasoning trace, significant…
This paper provides the first longitudinal analysis of log-based detection rule evolution in public repositories, finding that rule changes reflect ongoing operational trade-offs rather than steady co…
The paper introduces HERO'S JOURNEY, a benchmark for testing complex rule induction in text games, finding that while LLMs show limited rule induction ability, procedural tasks remain a significant ch…
Wangyi Mei, Zhouhong Gu, Zhenhan Bai, Yin Cai +8 more
The paper proposes Deep Research as Rubric (DR-rubric), a novel evidence-driven framework that treats rubric construction itself as a research problem to generate fine-grained, scalable reward signals…
This paper systematically evaluates the consistency of popular causal discovery benchmarks against real-world scientific literature, revealing significant variability in their accuracy.
MOOSE-Copilot is a novel web-based framework that unifies scientific hypothesis discovery by formalizing human-AI interaction, significantly improving performance over autonomous LLM baselines.
Zakk Heile, Hayden McTavish, Varun Babbar, Margo Seltzer +1 more
The paper introduces PRAXIS, a novel algorithm that efficiently approximates the computation of 'Rashomon sets' for decision trees, significantly reducing memory and runtime complexity.
Yuchen Liu, Yingjie Feng, Lixiong Qin, Jiasi Chen +4 more
The paper introduces Graph-Distance Contribution Reward (GDCR) and Step Advantage Policy Optimization (SAPO) to provide fine-grained, step-level credit assignment for agentic search by modeling world…
Pengyu Chen, Yonggang Zhang, Mingming Chen, Jun Song +2 more
The paper proposes a graph-constrained approach to scale multi-hop training data by decoupling path discovery from path verbalization, significantly expanding the usable corpus size for LLMs.
This paper localizes the attention heads within LLMs responsible for specific reasoning steps, finding that specialized heads handle factual retrieval while higher layers manage global information int…
The paper introduces ProvMind, a provenance-grounded reasoning framework that significantly improves materials synthesis process optimization by accurately predicting optimal synthesis routes under ch…
This paper proposes a scalable topological learning framework for higher-order graph representation by introducing simplified and factored cellular Weisfeiler Leman tests and a novel random walk metho…
The paper introduces LinTree, a method that explicitly structures the search history of LLM reasoning traces using parent pointers, significantly improving task performance and search efficiency compa…