cs.IR, Empirical

Tail-Aware Adaptive-k: Query-Adaptive Context Selection for Retrieval-Augmented Generation

Ziyu Song, Jiaming Fang, Kuangyu Li, Tuo Xia, Chuanpeng Wang

Jun 10, 2026

AI Summary (llama-3.1-8b-instruct)

This paper proposes Tail-Aware Adaptive-k (TAA-k), a training-free framework for adaptive context selection in retrieval-augmented generation systems using Extreme Value Theory.

TAA-k operationalizes EVT through a localized validation strategy, which lowers computational cost relative to global EVT methods while maintaining statistical rigor.

Keywords

retrieval-augmented generation, Extreme Value Theory, adaptive context selection, ranked lists

Before reading this…

Understanding of Extreme Value Theory
Retrieval-augmented generation systems

Applications

→Information retrieval
→Question answering systems

Abstract

Adaptive context selection is critical for retrieval-augmented generation (RAG) systems, as fixed Top-K retrieval fails under query-dependent and heavy-tailed similarity distributions. While Extreme Value Theory (EVT) offers a principled framework for adaptive truncation, existing approaches apply EVT globally across the entire ranked list, incurring prohibitive computational costs and statistical instability. We propose Tail-Aware Adaptive-k (TAA-k), a training-free framework that operationalizes EVT through a localized validation strategy. The key insight is that ranked similarity curves exhibit a characteristic steep-flat-steep pattern reflecting a transition from relevance-dominated to noise-dominated regimes. TAA-k exploits this geometric structure via knee detection to identify a compact candidate region, then applies EVT-based goodness-of-fit testing within this window to validate the onset of tail behavior. This coarse-to-fine design reduces computational complexity from $O(N^2 M)$ to $O(\sqrt{N \log N} \cdot M)$ while maintaining statistical rigor. Under mild monotone likelihood ratio assumptions, TAA-k yields a stable, query-adaptive cutoff corresponding to the earliest noise-dominated position. Experiments on WebQuestions, 2WikiMultiHopQA, and MuSiQue demonstrate that TAA-k achieves near-oracle retrieval quality (F1 within 2-3% of oracle) with orders-of-magnitude efficiency gains over global EVT methods, while maintaining robustness across embedding models and compression dimensions.
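
The coarse-to-fine idea in the abstract (knee detection to find a small candidate window, then a tail goodness-of-fit check inside that window) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the `half_width`, `tail_size`, and `max_ks` parameters, and the synthetic score list are all hypothetical. As a simplification, the tail model is an exponential distribution (the shape-zero special case of the generalized Pareto used in EVT), and fit quality is measured with a Kolmogorov-Smirnov distance against a fixed, illustrative tolerance rather than a calibrated test.

```python
import math


def find_knee(scores):
    """Index where a descending score curve sags farthest below the
    straight chord joining its first and last points (coarse step)."""
    n = len(scores)
    if n < 3:
        return 0
    first, last = scores[0], scores[-1]
    gaps = [first + (last - first) * i / (n - 1) - s
            for i, s in enumerate(scores)]
    return max(range(n), key=gaps.__getitem__)


def ks_exponential(excesses):
    """Kolmogorov-Smirnov distance between the sample and an exponential
    distribution whose mean is estimated from the same sample."""
    n = len(excesses)
    mean = sum(excesses) / n
    if mean <= 0.0:
        return 1.0  # degenerate sample: treat as a failed fit
    worst = 0.0
    for i, x in enumerate(sorted(excesses)):
        cdf = 1.0 - math.exp(-x / mean)
        worst = max(worst, (i + 1) / n - cdf, cdf - i / n)
    return worst


def adaptive_cutoff(scores, half_width=8, tail_size=30, max_ks=0.1):
    """Return k such that scores[:k] is kept (fine step).

    Assumption: a position k counts as noise-dominated when the
    tail_size scores starting at k, measured as excesses over the score
    at rank k + tail_size, fit an exponential tail within max_ks.
    Only positions within half_width of the knee are examined.
    """
    knee = find_knee(scores)
    lo = max(0, knee - half_width)
    hi = min(len(scores) - tail_size - 1, knee + half_width)
    for k in range(lo, hi + 1):
        threshold = scores[k + tail_size]
        excesses = [s - threshold for s in scores[k:k + tail_size]]
        if ks_exponential(excesses) <= max_ks:
            return k  # earliest position in the window that passes
    return knee  # hypothetical fallback: no position passed the check


# Synthetic ranked list: five high "relevant" scores followed by 200
# noise scores placed at exponential quantiles (illustrative data only).
relevant = [0.90, 0.88, 0.86, 0.85, 0.83]
noise = [0.3 + 0.02 * math.log(200 / (i + 0.5)) for i in range(200)]
scores = relevant + noise

cutoff = adaptive_cutoff(scores)
print(cutoff, scores[:cutoff])
```

In this sketch the knee alone lands a few ranks past the planted boundary, and the localized fit check is what pulls the cutoff back to the first position whose following scores look like a clean tail. A fuller version would presumably fit a generalized Pareto distribution and use a properly calibrated test statistic.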