~ similar to 2605.29384· 20 results
Lixuan Guo, Yifei Wang, Tiansheng Wen, Aosong Feng +2 more
The paper introduces Single-stage Sparse Retrieval (SSR), a method that replaces computationally expensive vector clustering with sparse autoencoding to achieve highly efficient multi-vector retrieval…
The paper proposes DART, a test-time adaptation method that enhances zero-resource dense retrieval reranking by adaptively tuning a bilinear scoring matrix using pseudo-positive and pseudo-negative ex…
Zhixin Cai, Jun Bai, Yang Liu, Jiaqi Li +6 more
Xetrieval introduces an embedding-level framework to mechanistically explain dense retrieval decisions by decomposing high-dimensional embeddings into sparse, human-interpretable features.
This paper proposes a multi-turn retrieval-augmented generation pipeline for conversational systems across four domains.
Boqian Wu, Qiao Xiao, Patrik Okanovic, Tomasz Sternal +5 more
This paper introduces a new scaling law for sparse language models trained with limited data, demonstrating that sparsity can significantly improve performance and delay data saturation during multi-e…
Xuan Lu, Haohang Huang, Yingqi Fan, Junlong Tong +4 more
This paper proposes CompRank, a token-efficient reranking framework for large language models that reduces redundant computation and achieves strong reranking performance.
Paul Jünger, Justin Lovelace, Linxi Zhao, Dongyoung Go +1 more
The paper introduces SARDI, a novel, training-free framework that uses low-confidence 'lookahead' tokens generated during the denoising process of discrete diffusion language models to dynamically gui…
Tao Feng, Chongrui Ye, Tianyang Luo, Jingjun Xu +4 more
ElasticMem introduces a novel framework that treats memory as an elastic latent resource, allowing LLM agents to adaptively manage and inject variable-budget memories for improved performance in long-…
The paper theoretically analyzes the properties that optimal sparse autoencoder (SAE) dictionaries must satisfy, deriving constraints that explain observed SAE behaviors like hierarchical splitting an…
CoHyDE introduces an iterative co-training framework that jointly optimizes an LLM rewriter and a dense encoder, significantly improving tool retrieval accuracy for LLM agents, especially on vague que…
The paper introduces SPECTRA, a scalable framework for generating large, synthetic, and controllable information retrieval test collections, demonstrating its ability to expose system scaling and fail…
Vincent-Daniel Yun, Youngrae Kim, Woosang Lim, YoungJin Heo +2 more
The paper proposes Locality-Aware Redundancy Pruning (LoRP), a training-free method that prunes LLM layers by exploiting localized inter-layer redundancy, leading to improved efficiency while maintain…
The paper proposes MIMO, a two-stage framework that improves Multilingual Information Retrieval (MLIR) by stabilizing cross-lingual alignment and enhancing retrieval discrimination using a combination…
The paper proposes DINOSAUR, a framework that incorporates embedding uncertainty into Approximate Nearest Neighbour search to improve retrieval for niche, long-tail content.
The paper introduces a distributional framework using Wasserstein distance to unify the semantic comparison of sparse autoencoder features across different layers and to automatically compress large f…
SPARQLe is a hardware-software co-design framework that exploits the inherent sub-precision sparsity of LLM activations to reduce memory traffic and enable efficient computation on lower-bit datapaths…
Adly Templeton, Tom Conerly, Jonathan Marcus, Jack Lindsey +22 more
The paper demonstrates that sparse autoencoders can successfully extract a large set of interpretable, causally influential features from the production-scale Claude 3 Sonnet language model.
The paper introduces the Vector Network (VN), a novel recurrent architecture that replaces fixed weight matrices with reusable weight atoms, enabling superior compositional generalization by making st…
The paper proposes InSemRAG, an enhanced RAG framework that improves retrieval accuracy and knowledge integrity by incorporating intent-aware retrieval and semantics-preserving chunking, achieving sta…
The paper introduces Entity-Collision, a rigorous protocol that separates genuine retrieval lift from simple lexical overlap, demonstrating that embedder performance depends critically on the query ty…