"Knowledge of large language models"

20 results for “Knowledge of large language models”

CS papers only

Hybrid search: Keyword + semantic, ranked by combined score.ⓘ

Want pure semantic search? Try claim verification →

stat.OTcs.AIEmpiricalRecentJun 9, 2026

Flaws in the LLM Automation Narrative

George Perrett, Javae Elliott, Jennifer Hill, Marc Scott

This paper evaluates the performance of a Large Language Model (LLM) in a high-stakes context by comparing it to human experts and measuring variance and error magnitude.

View →

cs.CLcs.AIRecentMay 30, 2026

Revisiting Parameter-Based Knowledge Editing in Large Language Models: Theoretical Limits and Empirical Evidence

Wanying Ren, Xin Song, Futing Wang, Guoxiu He +1 more

The paper theoretically analyzes the limitations of parameter-based knowledge editing and empirically demonstrates that these methods consistently damage core LLM capabilities compared to retrieval-ba…

View →

cs.CLRecentMay 29, 2026

MoG: Mixture of Experts for Graph-based Retrieval-Augmented Generation

Zheng Yuan, Chuang Zhou, Linhao Luo, Siyu An +3 more

MoG proposes a novel Mixture of Experts framework for graph-based RAG, which uses hub graphs to guide the sparse activation of domain-specific expert graphs, significantly improving retrieval accuracy…

View →

cs.CLcs.AIcs.LGRecentMay 27, 2026

Extracting Small Translation Specialists from LLMs by Aggressively Pruning Experts

Liu O. Martin, Lucas Bandarkar, Nanyun Peng

The paper proposes an aggressive, parameter-efficient method to prune non-essential experts from Mixture-of-Experts (MoE) LLMs, significantly compressing the model while maintaining high machine trans…

View →

cs.CLcs.AIcs.LGRecentJun 1, 2026

ProbeScale: Probing Analysis to Optimize Neural Scaling Laws for Efficient Small Language Model Inference

Sourav Das

ProbScale is a novel framework that combines neural scaling laws and language model probing to identify highly efficient, task-specific subnetworks within pre-trained Small Language Models, achieving…

View →

cs.CLcs.AIcs.LGRecentMay 27, 2026

Pruning and Distilling Mixture-of-Experts into Dense Language Models

Junhyuck Kim, Jihun Yun, Haechan Kim, Gyeongman Kim +2 more

The paper introduces a systematic framework to convert large Mixture-of-Experts (MoE) models into memory-efficient, fully dense architectures, achieving superior performance compared to traditional pr…

View →

cs.AIRecentMay 31, 2026

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

Jiarui Feng, Hanqing Zeng, Karish Grover, Ruizhong Qiu +10 more

The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step r…

View →

cs.CLcs.AIRecentMay 30, 2026

EPIC: Efficient and Parallel Inference under CFG Constraints for Diffusion Language Models

Hyundong Jin, Yo-Sub Han

The paper proposes EPIC, an efficient and parallel decoding framework that significantly speeds up the process of constraining diffusion language model outputs using Context-Free Grammars (CFG).

View →

cs.CRcs.AIRecentMay 7, 2026

Fine-Tuning Small Language Models for Solution-Oriented Windows Event Log Analysis

Siraaj Akhtar, Saad Khan, Simon Parkinson

This paper demonstrates that fine-tuning small language models (SLMs) on a synthetic, solution-rich Windows event log dataset allows them to outperform larger LLMs in identifying issues and providing…

View →

eess.AScs.AIcs.SDRecentMay 29, 2026

A Unified and Reproducible Experimentation Framework for Speech Understanding

Jing Peng, Junhao Du, Chenghao Wang, Hanqi Li +20 more

The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.

View →

cs.LGcs.AIRecentMay 27, 2026

Learning the Error Patterns of Language Models

Jinwoo Kim, Taylor Berg-KirkPatrick, Loris D'Antoni

The paper introduces prefix filters and an algorithm (Palla) to systematically learn and apply specific error patterns in Large Language Models, significantly improving constrained generation tasks li…

View →

cs.CLcs.AIcs.LGEmpiricalRecentJun 11, 2026

SkMTEB: Slovak Massive Text Embedding Benchmark and Model Adaptation

Marek Šuppa, Andrej Ridzik, Daniel Hládek, Natália Kňažeková +1 more

This paper introduces SkMTEB, a comprehensive text embedding benchmark for Slovak, and develops efficient, locally-deployable Slovak embeddings.

View →

cs.CRcs.CLcs.LGRecentMar 25, 2026

How Vulnerable Are Edge LLMs?

Ao Ding, Hongzong Li, Zi Liang, Zhanpeng Shi +4 more

The paper investigates the security risk of extracting knowledge from quantized LLMs deployed on edge devices, showing that structured querying can effectively bypass quantization protections.

View →

cs.AIcs.CLRecentMay 28, 2026

Demystifying Data Organization for Enhanced LLM Training

Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang +7 more

This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM trainin…

View →

cs.CLcs.AIRecentJun 1, 2026

Multilingual Idioms in Sentences and Conversations Across High-, Medium-, and Low-Resource Languages

Saeed Almheiri, Bilal Elbouardi, Salsabila Zahirah Pranida, Irina Nikishina +15 more

The paper introduces MIDI, a novel multilingual dataset that embeds idioms in realistic sentence and conversational contexts across diverse resource levels, revealing that idiom comprehension is signi…

View →

cs.CLRecentMay 30, 2026

OCC-RAG: Optimal Cognitive Core for Faithful Question Answering

Maksim Savkin, Mikhail Goncharov, Alexander Gambashidze, Alla Chepurova +6 more

The paper introduces OCC-RAG, a family of compact, task-specialized Small Language Models (SLMs) designed to achieve highly faithful, multi-hop question answering grounded strictly in provided context…

View →

cs.CLcs.AIcs.DSRecentMay 29, 2026

Neuro-symbolic Syntactic Parsing: Shaping a Neural Network with the CYK Algorithm

Fabio Massimo Zanzotto, Federico Ranaldi, Giorgio Satta

The paper proposes CYKNN, a novel recurrent neural network architecture that directly encodes the CYK parsing algorithm, demonstrating superior performance over large language models on syntactic pars…

View →

cs.CLcs.LGRecentMay 29, 2026

Scaling Multi-Hop Training Data via Graph-Constrained Path Selection

Pengyu Chen, Yonggang Zhang, Mingming Chen, Jun Song +2 more

The paper proposes a graph-constrained approach to scale multi-hop training data by decoupling path discovery from path verbalization, significantly expanding the usable corpus size for LLMs.

View →

cs.DScs.AIcs.CLRecentMay 28, 2026

On Language Generation in the Limit with Bounded Memory

Jon Kleinberg, Anay Mehrotra, Amin Saberi, Grigoris Velegkas

The paper analyzes language generation and identification in the limit under bounded memory, showing that memory constraints significantly alter learnability, particularly affecting achievable density…

View →

cs.CLRecentMay 29, 2026

Parameter Alignment Mitigates Catastrophic Forgetting in Multilingual Expert Language Models

Sanchit Ahuja, Terra Blevins

The paper introduces and evaluates five parameter alignment strategies that significantly mitigate catastrophic forgetting when continually pretraining multilingual expert language models across multi…

View →