Papers similar to 2606.01070

~ similar to 2606.01070· 18 results

cs.IREmpiricalRecentJun 10, 2026

CompRank: Efficient LLM Reranking via Token-Level Compression and Decoding-Free Scoring

Xuan Lu, Haohang Huang, Yingqi Fan, Junlong Tong +4 more

This paper proposes CompRank, a token-efficient reranking framework for large language models that reduces redundant computation and achieves strong reranking performance.

View →

cs.IREmpiricalRecentJun 10, 2026

Tail-Aware Adaptive-k: Query-Adaptive Context Selection for Retrieval-Augmented Generation

Ziyu Song, Jiaming Fang, Kuangyu Li, Tuo Xia +1 more

This paper proposes Tail-Aware Adaptive-k (TAA-k), a training-free framework for adaptive context selection in retrieval-augmented generation systems using Extreme Value Theory.

View →

cs.AIcs.IRcs.LGRecentMay 28, 2026

CoHyDE: Iterative Co-Training of LLM Rewriter & Dense Encoder for Tool Retrieval

Vaishali Senthil, Ashutosh Hathidara, Sebastian Schreiber

CoHyDE introduces an iterative co-training framework that jointly optimizes an LLM rewriter and a dense encoder, significantly improving tool retrieval accuracy for LLM agents, especially on vague que…

View →

cs.IRcs.AIcs.LGRecentMay 28, 2026

No More K-means: Single-Stage Sparse Coding for Efficient Multi-Vector Retrieval

Lixuan Guo, Yifei Wang, Tiansheng Wen, Aosong Feng +2 more

The paper introduces Single-stage Sparse Retrieval (SSR), a method that replaces computationally expensive vector clustering with sparse autoencoding to achieve highly efficient multi-vector retrieval…

View →

cs.CLcs.IREmpiricalRecentJun 10, 2026

uva-irlab-conv at SemEval-2026 Task 8: Multi-Turn RAG with Learned Sparse Retrieval and Listwise Reranking

Simon Lupart, Kidist Amde Mekonnen, Zahra Abbasiantaeb, Mohammad Aliannejadi

This paper proposes a multi-turn retrieval-augmented generation pipeline for conversational systems across four domains.

View →

cs.IRcs.AIRecentMay 30, 2026

SkillPager: Query-Adaptive Intra-Skill Navigation via Semantic Node Retrieval

Zicai Cui, Zihan Guo, Weiwen Liu, Weinan Zhang

SkillPager is a novel two-stage framework that efficiently selects minimal, execution-sufficient context from large procedural skill documents by leveraging typed semantic nodes, significantly reducin…

View →

cs.IRcs.AIcs.CLRecentMay 28, 2026

Latent Terms: Dense Retrievers Contain Trivially Extractable BM25-ready Zipfian Vocabularies

Benjamin Clavié, Sean Lee, Aamir Shakir, Makoto P. Kato

The paper introduces Latent Terms, a method that shows dense retrieval models implicitly learn sparse, Zipfian vocabularies that can be used for classical BM25-style sparse scoring without requiring s…

View →

cs.AIcs.IRRecentMay 28, 2026

Xetrieval: Mechanistically Explaining Dense Retrieval

Zhixin Cai, Jun Bai, Yang Liu, Jiaqi Li +6 more

Xetrieval introduces an embedding-level framework to mechanistically explain dense retrieval decisions by decomposing high-dimensional embeddings into sparse, human-interpretable features.

View →

cs.CVcs.AIRecentMay 29, 2026

Beyond Classification: Dynamic Adapter Routing for Continual Multimodal Retrieval

Alicja Dobrzeniecka, Filip Szatkowski, Sebastian Cygert, Szymon Lukasik +1 more

The paper proposes Dynamic Adapter Routing (DAR), a novel method that significantly improves continual multimodal retrieval by adaptively selecting and merging specialized adapters.

View →

cs.AIRecentMay 28, 2026

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

Haowen Wang, Yaxin Du, Jian Yang, Jiajun Wu +8 more

MIRA proposes a novel source-aware filtering framework that discovers and anchors evaluation rubrics during data selection, significantly improving code-oriented mid-training data quality while reduci…

View →

cs.AIRecentMay 28, 2026

RAISE: RAG Design as an Architecture Search Problem

Zhen Chen, Yibing Liu, Weihao Xie, Yu Liang +2 more

The paper proposes formulating RAG design as an architecture search problem and introduces RAISE, a comprehensive framework and benchmark for systematically optimizing RAG hyperparameters.

View →

cs.CRcs.IRRecentMay 19, 2026

BiRD: A Bidirectional Ranking Defense Mechanism for Retrieval Augmented Generation

Chengcai Gao, Zhihong Sun, Xiaochuan Shi, Qiufeng Wang +1 more

The paper proposes BiRD, a bidirectional ranking defense mechanism that enhances the robustness of Retrieval-Augmented Generation (RAG) against adversarial attacks by analyzing the alignment between f…

View →

cs.AIRecentJun 1, 2026

RASER: Recoverability-Aware Selective Escalation Router for Multi-Hop Question Answering

Yuyang Li, Zihe Yan, Tobias Käfer

RASER introduces a family of cheap, router-based systems that selectively decide whether to perform expensive multi-hop retrieval, significantly reducing LLM token costs while maintaining state-of-the…

View →

cs.AIcs.IRRecentMay 28, 2026

HiKEY: Hierarchical Multimodal Retrieval for Open-Domain Document Question Answering

Joongmin Shin, Gyuho Shim, Jeongbae Park, Jaehyung Seo +1 more

HiKEY proposes a hierarchical, tree-based multimodal retrieval framework that significantly improves open-domain document question answering by addressing document routing and evidence fragmentation.

View →

cs.IRcs.AIRecentMay 29, 2026

SPECTRA: Synthetic IR Test Collections with Relevance Oracles and Controlled Distractor Diagnostics

Eric Liang

The paper introduces SPECTRA, a scalable framework for generating large, synthetic, and controllable information retrieval test collections, demonstrating its ability to expose system scaling and fail…

View →

cs.IRcs.AIRecentMay 29, 2026

DynaTree: Dynamic Agentic Retrieval Tree for Time-Sensitive News Retrieval

Siyuan Qi, Xinyuan Wang, Yingxuan Yang, Haochuan Guo +4 more

DynaTree introduces a two-stage framework that pre-constructs a reusable retrieval tree offline using coordinated agents, allowing for efficient, structure-aware, and highly effective time-sensitive n…

View →

cs.CLcs.IRRecentJun 2, 2026

Re-Ranking Through an Attribution Lens for Citation Quality in Legal QA

Mohamed Hesham Elganayni, Selim Saleh

The paper introduces a cross-encoder re-ranker trained on attribution scores to improve the retrieval of highly relevant citation passages for legal question answering, outperforming standard semantic…

View →

cs.AIcs.IRRecentMay 28, 2026

Rethinking Literature Search Evaluation: Deep Research Helps, and Human Citation Lists Are Not a Ground Truth

Gaurav Sahu, Laurent Charlin, Christopher Pal

The paper introduces a Deep Research pipeline that significantly improves literature search recall and demonstrates that human-curated citation lists are often unreliable and do not serve as a true gr…

View →