20 results for “Knowledge of large language models and reranking”
CS papers onlyHybrid search: Keyword + semantic, ranked by combined score.ⓘ
Want pure semantic search? Try claim verification →
This paper evaluates the performance of a Large Language Model (LLM) in a high-stakes context by comparing it to human experts and measuring variance and error magnitude.
The paper proposes an aggressive, parameter-efficient method to prune non-essential experts from Mixture-of-Experts (MoE) LLMs, significantly compressing the model while maintaining high machine trans…
Xuan Lu, Haohang Huang, Yingqi Fan, Junlong Tong +4 more
This paper proposes CompRank, a token-efficient reranking framework for large language models that reduces redundant computation and achieves strong reranking performance.
This paper proposes a multi-turn retrieval-augmented generation pipeline for conversational systems across four domains.
Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang +7 more
This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM trainin…
Jiarui Feng, Hanqing Zeng, Karish Grover, Ruizhong Qiu +10 more
The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step r…
Zheng Yuan, Chuang Zhou, Linhao Luo, Siyu An +3 more
MoG proposes a novel Mixture of Experts framework for graph-based RAG, which uses hub graphs to guide the sparse activation of domain-specific expert graphs, significantly improving retrieval accuracy…
Wanying Ren, Xin Song, Futing Wang, Guoxiu He +1 more
The paper theoretically analyzes the limitations of parameter-based knowledge editing and empirically demonstrates that these methods consistently damage core LLM capabilities compared to retrieval-ba…
The paper proposes DART, a test-time adaptation method that enhances zero-resource dense retrieval reranking by adaptively tuning a bilinear scoring matrix using pseudo-positive and pseudo-negative ex…
Junhyuck Kim, Jihun Yun, Haechan Kim, Gyeongman Kim +2 more
The paper introduces a systematic framework to convert large Mixture-of-Experts (MoE) models into memory-efficient, fully dense architectures, achieving superior performance compared to traditional pr…
Marek Šuppa, Andrej Ridzik, Daniel Hládek, Natália Kňažeková +1 more
This paper introduces SkMTEB, a comprehensive text embedding benchmark for Slovak, and develops efficient, locally-deployable Slovak embeddings.
Pengyu Chen, Yonggang Zhang, Mingming Chen, Jun Song +2 more
The paper proposes a graph-constrained approach to scale multi-hop training data by decoupling path discovery from path verbalization, significantly expanding the usable corpus size for LLMs.
This paper proposes a lightweight encoder-based MEL solution called FAST-MEL that meets three objectives: high linking accuracy, computational efficiency, and storage efficiency.
Yilun Yao, Jiaming Pan, Elsie Dai, Peizhuang Cong +2 more
ConMoE proposes a train-free method for compressing Mixture-of-Experts (MoE) models by consolidating the large expert pool into a smaller set of reusable prototypes and deterministically remapping all…
The paper introduces RAG-Pref, a novel, training-free Retrieval Augmented Generation (RAG) method for preference alignment that significantly improves LLM refusal guardrails against agentic attacks wi…
This study systematically evaluates a wide range of chunking methods for Retrieval-Augmented Generation (RAG) to assess their effectiveness and highlight the overlooked challenges associated with chun…
PrunePath introduces a budget-adaptive structured sparsification framework that efficiently prunes Feed-forward networks in large language models, achieving hardware-friendly sparsity and measurable s…
ProbScale is a novel framework that combines neural scaling laws and language model probing to identify highly efficient, task-specific subnetworks within pre-trained Small Language Models, achieving…
The paper proposes EPIC, an efficient and parallel decoding framework that significantly speeds up the process of constraining diffusion language model outputs using Context-Free Grammars (CFG).
The paper introduces TSVD, a novel framework that efficiently pre-trains LLMs by enforcing both low rank and strict weight orthonormality, achieving performance comparable to full-parameter models wit…