20 results for “information retrieval”
CS papers only. Hybrid search: keyword + semantic, ranked by combined score.
The paper proposes MIMO, a two-stage framework that improves Multilingual Information Retrieval (MLIR) by stabilizing cross-lingual alignment and enhancing retrieval discrimination using a combination…
The paper introduces SPECTRA, a scalable framework for generating large, synthetic, and controllable information retrieval test collections, demonstrating its ability to expose system scaling and fail…
Lixuan Guo, Yifei Wang, Tiansheng Wen, Aosong Feng +2 more
The paper introduces Single-stage Sparse Retrieval (SSR), a method that replaces computationally expensive vector clustering with sparse autoencoding to achieve highly efficient multi-vector retrieval…
SkillPager is a novel two-stage framework that efficiently selects minimal, execution-sufficient context from large procedural skill documents by leveraging typed semantic nodes, significantly reducin…
The paper proposes InSemRAG, an enhanced RAG framework that improves retrieval accuracy and knowledge integrity by incorporating intent-aware retrieval and semantics-preserving chunking, achieving sta…
Joongmin Shin, Gyuho Shim, Jeongbae Park, Jaehyung Seo +1 more
HiKEY proposes a hierarchical, tree-based multimodal retrieval framework that significantly improves open-domain document question answering by addressing document routing and evidence fragmentation.
Alireza Salemi, Chang Zeng, Atharva Nijasure, Jui-Hui Chung +3 more
GrepSeek introduces a novel direct corpus interaction (DCI) search agent that trains an LLM to find and compose evidence from large text corpora by issuing executable shell commands, achieving state-o…
This paper proposes a multi-turn retrieval-augmented generation pipeline for conversational systems across four domains.
Jinheon Baek, Soyeong Jeong, Sangwoo Park, Woongyeong Yeo +4 more
OmniRetrieval introduces a unified framework that handles natural language queries across diverse, heterogeneous knowledge sources (text, relational, graphs) by dispatching source-native queries witho…
Ziyu Song, Jiaming Fang, Kuangyu Li, Tuo Xia +1 more
This paper proposes Tail-Aware Adaptive-k (TAA-k), a training-free framework for adaptive context selection in retrieval-augmented generation systems using Extreme Value Theory.
Zhen Chen, Yibing Liu, Weihao Xie, Yu Liang +2 more
The paper proposes formulating RAG design as an architecture search problem and introduces RAISE, a comprehensive framework and benchmark for systematically optimizing RAG hyperparameters.
The paper introduces a Deep Research pipeline that significantly improves literature search recall and demonstrates that human-curated citation lists are often unreliable and do not serve as a true gr…
This paper proposes a lightweight encoder-based MEL solution called FAST-MEL that meets three objectives: high linking accuracy, computational efficiency, and storage efficiency.
The paper systematically compares multiple content representations for RAG pipelines and finds that answer retention—the ability of the representation to preserve the original answer-bearing content—i…
Xuan Lu, Haohang Huang, Yingqi Fan, Junlong Tong +4 more
This paper proposes CompRank, a token-efficient reranking framework for large language models that reduces redundant computation and achieves strong reranking performance.
Chengcai Gao, Zhihong Sun, Xiaochuan Shi, Qiufeng Wang +1 more
The paper proposes BiRD, a bidirectional ranking defense mechanism that enhances the robustness of Retrieval-Augmented Generation (RAG) against adversarial attacks by analyzing the alignment between f…
The paper introduces Latent Terms, a method that shows dense retrieval models implicitly learn sparse, Zipfian vocabularies that can be used for classical BM25-style sparse scoring without requiring s…
Jiaman He, Riccardo Xia, Dana McKay, Damiano Spina +1 more
The paper presents SearchLog, a web browser extension for collecting natural search logs during lab-based studies.
Siyuan Qi, Xinyuan Wang, Yingxuan Yang, Haochuan Guo +4 more
DynaTree introduces a two-stage framework that pre-constructs a reusable retrieval tree offline using coordinated agents, allowing for efficient, structure-aware, and highly effective time-sensitive n…