~ similar to 2606.01697· 19 results
This paper proposes a multi-turn retrieval-augmented generation pipeline for conversational systems across four domains.
The paper proposes InSemRAG, an enhanced RAG framework that improves retrieval accuracy and knowledge integrity by incorporating intent-aware retrieval and semantics-preserving chunking, achieving sta…
Zhixin Cai, Jun Bai, Yang Liu, Jiaqi Li +6 more
Xetrieval introduces an embedding-level framework to mechanistically explain dense retrieval decisions by decomposing high-dimensional embeddings into sparse, human-interpretable features.
Alireza Salemi, Chang Zeng, Atharva Nijasure, Jui-Hui Chung +3 more
GrepSeek introduces a novel direct corpus interaction (DCI) search agent that trains an LLM to find and compose evidence from large text corpora by issuing executable shell commands, achieving state-o…
The paper systematically compares multiple content representations for RAG pipelines and finds that answer retention—the ability of the representation to preserve the original answer-bearing content—i…
SkillPager is a novel two-stage framework that efficiently selects minimal, execution-sufficient context from large procedural skill documents by leveraging typed semantic nodes, significantly reducin…
Critic-R introduces a novel framework that uses a critic model to provide natural language introspective feedback, significantly improving the performance of agentic search systems by optimizing retri…
Xuan Lu, Haohang Huang, Yingqi Fan, Junlong Tong +4 more
This paper proposes CompRank, a token-efficient reranking framework for large language models that reduces redundant computation and achieves strong reranking performance.
Zhipeng Qian, Zihan Liang, Yufei Ma, Ben Chen +6 more
The paper introduces Plan, a structured agentic behavior that decomposes multi-hop questions into ordered sub-questions before retrieval, and proposes a self-bootstrapping paradigm to train it without…
CoHyDE introduces an iterative co-training framework that jointly optimizes an LLM rewriter and a dense encoder, significantly improving tool retrieval accuracy for LLM agents, especially on vague que…
Han Zhang, Zihao Tang, Xin Yu, Xiao Liu +7 more
The paper introduces RHELM, a new benchmark designed to test LLMs' long-term memory by simulating realistic, complex, and evolving dialogues that integrate multiple heterogeneous data sources.
Zilin Xiao, Qi Ma, Chun-cheng Jason Chen, Xintao Chen +3 more
This paper proposes a post-training framework called Retrieval-Augmented Reinforcement Fine-Tuning (RA-RFT) to teach language models to reason by analogy.
Tao Feng, Tianyang Luo, Jingjun Xu, Zhigang Hua +4 more
ExpWeaver introduces a novel framework for LLM agents to learn from past experiences using latent retrieval-augmented generation, achieving state-of-the-art performance while significantly improving t…
This study systematically evaluates a wide range of chunking methods for Retrieval-Augmented Generation (RAG) to assess their effectiveness and highlight the overlooked challenges associated with chun…
The paper proposes MIMO, a two-stage framework that improves Multilingual Information Retrieval (MLIR) by stabilizing cross-lingual alignment and enhancing retrieval discrimination using a combination…
The paper proposes DART, a test-time adaptation method that enhances zero-resource dense retrieval reranking by adaptively tuning a bilinear scoring matrix using pseudo-positive and pseudo-negative ex…
SilentRetrieval introduces a sophisticated, two-stage data poisoning attack that successfully hijacks Retrieval-Augmented Generation (RAG) systems by injecting adversarially crafted, yet highly fluent…
Ziyu Song, Jiaming Fang, Kuangyu Li, Tuo Xia +1 more
This paper proposes Tail-Aware Adaptive-k (TAA-k), a training-free framework for adaptive context selection in retrieval-augmented generation systems using Extreme Value Theory.
Daniel Arnould, Rashad Aziz, Zixuan Kang, Tanav Changal +4 more
CA-BED is a novel framework that improves LLM performance in interactive question-answering by integrating Bayesian Experimental Design to strategically select questions that maximize information gain…