20 results for “Incomplete information”
CS papers onlyHybrid search: Keyword + semantic, ranked by combined score.ⓘ
Want pure semantic search? Try claim verification →
Renjie Gu, Jiaxu Li, Yihao Wang, Yun Yue +7 more
The paper addresses the 'detection-to-abstention gap' in reasoning models, where detecting insufficient information does not lead to abstention, by proposing a novel control framework that forces mode…
The paper introduces ASTRAL, a multimodal LLM-driven framework that reconstructs and analyzes fragmented cyber-physical system architectures to enable comprehensive and quantitative security risk asse…
Jan Pennekamp, Johannes Lohmöller, David Schütte, Joscha Loos +1 more
This paper systematically analyzes 2.7 million arXiv submissions to demonstrate that nearly every preprint unintentionally discloses sensitive or unnecessary information through its source files, prop…
This paper settles the complexity of three sketching problems in graphs and distributions.
The paper introduces a diagnostic benchmark for selective Question Answering over conflicting, multi-source personal memory, demonstrating that specialized fusion resolvers outperform general LLMs, es…
HuiMing Fan, Xiao Wang, Zheng Chu, Qianyu Wang +4 more
The paper argues that current search agents often verify existing knowledge rather than genuinely searching, and introduces LiveBrowseComp, a new benchmark to measure true evidence-driven discovery.
The paper introduces Inconsistency-Aware Minimization (IAM), a novel training objective that uses a label-free measure called local inconsistency to improve model generalization, particularly in semi-…
Sunisth Kumar, Xanh Ho, Tim Schopf, Andre Greiner-Petter +2 more
The paper explains the 'table-chart gap' in scientific claim verification by showing that multimodal LLMs successfully encode information from charts but fail to route it to the final prediction layer…
The paper simulates bargaining scenarios using LLM agents to analyze how optimizing agents for financial profit affects their honesty and trust, finding that while fine-tuning improves deal-making, it…
The study demonstrates that LLMs exhibit significant, language-driven disparities in medical triage recommendations, recommending emergency care more frequently for English and Arabic prompts, even wh…
This paper introduces a framework to audit source-dependence in multi-source RAG systems, demonstrating that disagreement across institutional sources is a common and critical failure mode that curren…
Bowen Tian, Caixue He, Jiemin Wu, Jingying Wang +3 more
AnyEdit++ introduces a structure-aware framework that uses Bayesian Surprise to adaptively segment long-form knowledge, significantly improving the coherence and accuracy of knowledge editing in LLMs.
This paper investigates the 'faithfulness gap' in LLM agents—the discrepancy between stated reasoning and actual action—by decomposing it into two opposing steps: reasoning-to-conclusion and conclusio…
The paper introduces Factual Density (FD*), a novel retrieval signal that measures the proportion of verified facts, demonstrating that optimizing RAG retrieval based on this density significantly imp…
This paper introduces cost-aware Retrieval-Augmented Generation (RAG), demonstrating that fixed evidence selection is brittle and that adaptive, agentic controllers are necessary for effective knowled…
The paper proposes a deterministic, version-aware aggregation method that significantly outperforms existing LLM-based systems for resolving memory conflicts in fact consolidation tasks.
The paper proposes using question-asking as an inference-time intervention to probe a language model's hidden state, finding that the self-diagnosis process provides a predictive signal for final corr…
Wanying Ren, Xin Song, Futing Wang, Guoxiu He +1 more
The paper theoretically analyzes the limitations of parameter-based knowledge editing and empirically demonstrates that these methods consistently damage core LLM capabilities compared to retrieval-ba…
Thomas Humphries, Tim Li, Shufan Zhang, Karl Knopf +1 more
The paper introduces PostRI, a novel method that allows for computing a Randomization Interval (RI) for differentially private median queries after the median has already been estimated, significantly…
Chen He, Yuhao Wu, Lei Wang, Wenxuan Zhang +1 more
The paper identifies and demonstrates that post-conclusion continuation in answer-correct long-CoT traces is harmful during LLM fine-tuning, proposing a method to cut this continuation.