20 results for “Question answering”
CS papers onlyHybrid search: Keyword + semantic, ranked by combined score.ⓘ
Want pure semantic search? Try claim verification →
Zhensheng Wang, Xiaole Liu, Wenmian Yang, Kun Zhou +2 more
The paper introduces Open-Domain Tabular Question Answering for Future Data Forecasting and Reasoning, a new dataset and framework that enables LLMs to perform time-series forecasting and reasoning on…
HypothesisMed introduces an inference-time pipeline for biomedical question answering that improves model reliability and structured output generation by fusing multiple model outputs and diagnosing t…
RASER introduces a family of cheap, router-based systems that selectively decide whether to perform expensive multi-hop retrieval, significantly reducing LLM token costs while maintaining state-of-the…
The paper introduces Brain-IT-VQA, a novel framework that significantly improves visual question answering from fMRI signals, and presents NSD-VQA, a new, highly controlled dataset for this task.
Chengtao Gan, Zhiqiang Liu, Long Jin, Yushan Zhu +2 more
CRAFTQA introduces a novel adaptive, code-driven framework that significantly enhances complex structured data reasoning by dynamically generating custom code functions beyond predefined operations.
The paper compares verbalized feature attributions and self-generated rationales for explaining model behavior, finding that the format and granularity of the explanation significantly affect its abil…
The paper proposes using question-asking as an inference-time intervention to probe a language model's hidden state, finding that the self-diagnosis process provides a predictive signal for final corr…
This paper proposes a multi-turn retrieval-augmented generation pipeline for conversational systems across four domains.
The paper introduces CAN-QA, a novel question-answering benchmark that reformulates CAN traffic analysis from a classification task to a reasoning task, demonstrating that current LLMs struggle with c…
Siddhesh Milind Pawar, Sarah Masud, Haneul Yoo, Alice Oh +1 more
The paper introduces FRANZ, a communicative audit framework, to evaluate how LLMs frame responses to subjective questions, finding that LLMs exhibit statistically significant and coupled differences i…
The paper introduces OCC-RAG, a family of compact, task-specialized Small Language Models (SLMs) designed to achieve highly faithful, multi-hop question answering grounded strictly in provided context…
Yuxin Wang, Jiahao Lu, Qifeng Wu, Shicheng Fang +4 more
AdaptR1 is a novel Reinforcement Learning framework that adaptively manages reasoning effort at every step of multi-hop Question Answering, significantly reducing unnecessary computational cost withou…
The paper systematically compares multiple content representations for RAG pipelines and finds that answer retention—the ability of the representation to preserve the original answer-bearing content—i…
DeSQ is a novel, KB-agnostic framework that improves Knowledge Base Question Answering by decomposing complex questions into atomic constraints and generating structured SPARQL queries, achieving supe…
This paper demonstrates that retrieval-augmented in-context learning systems for document QA are vulnerable to membership inference attacks, proposing novel black-box methods that exploit query prefix…
Qiming Shi, Zhaolu Kang, Yunfan Zhou, Di Weng +1 more
SPADER is a novel reinforcement learning framework that addresses the challenges of Multi-Answer Question Answering by improving credit assignment and promoting diverse exploration during long-horizon…
Alireza Salemi, Chang Zeng, Atharva Nijasure, Jui-Hui Chung +3 more
GrepSeek introduces a novel direct corpus interaction (DCI) search agent that trains an LLM to find and compose evidence from large text corpora by issuing executable shell commands, achieving state-o…
This paper investigates if team-based interaction improves LLM performance on complex reasoning tasks (ChGK), finding that structured team strategies significantly boost accuracy by acting as error-fi…
Zhipeng Qian, Zihan Liang, Yufei Ma, Ben Chen +6 more
The paper introduces Plan, a structured agentic behavior that decomposes multi-hop questions into ordered sub-questions before retrieval, and proposes a self-bootstrapping paradigm to train it without…
Yansong Ning, Mianpeng Liu, Jingwen Ye, Weidong Zhang +1 more
The paper introduces HRBench, a unified and comprehensive evaluation framework for systematically benchmarking and comparing various thinking-mode switching strategies in hybrid-reasoning LLMs.