~ similar to 2605.31550· 19 results
Leo Luo, Haining Xie, Siqi Shen, Zhipeng Ma +7 more
SIRIUS-SQL introduces a robust multi-candidate text-to-SQL system that addresses weaknesses in candidate generation, error handling, and selection, achieving state-of-the-art performance on complex be…
Aravind Mandiga, Guoming Li, Jin Lu, Ismailcem Budak Arpinar +2 more
The paper introduces ProtStructQA, an executable benchmark that tests protein structural reasoning by requiring language models to generate measurable 3D coordinates, revealing a capability-dependent…
Huawei Zheng, Sen Yang, Zhaorui Yang, Yuhui Zhang +11 more
EviLink addresses the ambiguity of schema linking in Text-to-SQL by treating it as an uncertainty-aware inference over multiple plausible SQL paths, significantly improving recall and efficiency.
The paper proposes InSemRAG, an enhanced RAG framework that improves retrieval accuracy and knowledge integrity by incorporating intent-aware retrieval and semantics-preserving chunking, achieving sta…
The paper introduces REBench, a comprehensive, standardized benchmark dataset designed to enable fair and rigorous evaluation of Large Language Models (LLMs) on complex binary reverse engineering task…
This paper proposes a multi-turn retrieval-augmented generation pipeline for conversational systems across four domains.
Joongmin Shin, Gyuho Shim, Jeongbae Park, Jaehyung Seo +1 more
HiKEY proposes a hierarchical, tree-based multimodal retrieval framework that significantly improves open-domain document question answering by addressing document routing and evidence fragmentation.
Pengyu Chen, Yonggang Zhang, Mingming Chen, Jun Song +2 more
The paper proposes a graph-constrained approach to scale multi-hop training data by decoupling path discovery from path verbalization, significantly expanding the usable corpus size for LLMs.
The paper introduces Self-Conditioned Positional HNSW (SCP-HNSW), a method that modifies chunk embeddings and retrieval process to mitigate redundant evidence retrieval from overlapping document chunk…
The paper systematically compares multiple content representations for RAG pipelines and finds that answer retention—the ability of the representation to preserve the original answer-bearing content—i…
This study systematically evaluates a wide range of chunking methods for Retrieval-Augmented Generation (RAG) to assess their effectiveness and highlight the overlooked challenges associated with chun…
Divya Tadimeti, Shawn Pan, Sameera Lanka, Chenghui Zhou +1 more
This paper demonstrates that targeted adaptation of the small language model Phi Silica, using dataset curation and fine-tuning, significantly improves its performance in short-form text rewriting, na…
RASER introduces a family of cheap, router-based systems that selectively decide whether to perform expensive multi-hop retrieval, significantly reducing LLM token costs while maintaining state-of-the…
The paper introduces Entity-Collision, a rigorous protocol that separates genuine retrieval lift from simple lexical overlap, demonstrating that embedder performance depends critically on the query ty…
This paper evaluates the causal reasoning abilities of large language models and finds that they rely heavily on lexical pattern matching rather than structural reasoning.
The paper addresses 'Template Collapse' in 3D CT report generation—where models generate generic reports—by proposing CLarGen, a decoupled framework that significantly improves clinical accuracy and d…
Zhensheng Wang, Xiaole Liu, Wenmian Yang, Kun Zhou +2 more
The paper introduces Open-Domain Tabular Question Answering for Future Data Forecasting and Reasoning, a new dataset and framework that enables LLMs to perform time-series forecasting and reasoning on…
SkillPager is a novel two-stage framework that efficiently selects minimal, execution-sufficient context from large procedural skill documents by leveraging typed semantic nodes, significantly reducin…
DeSQ is a novel, KB-agnostic framework that improves Knowledge Base Question Answering by decomposing complex questions into atomic constraints and generating structured SPARQL queries, achieving supe…