"large language models" | ArxivCSExplorer

20 results for “large language models”

CS papers only

Hybrid search: keyword + semantic, ranked by combined score.

cs.CV | cs.AI | cs.MM | Empirical | Recent | Jul 10, 2026

Scalable Visual Pretraining for Language Intelligence

Yiming Zhang, Zhonghan Zhao, Wenwei Zhang, Haiteng Zhao +12 more

This paper shows that visual pretraining benefits foundation model intelligence, outperforming text-only pretraining across multiple backbones and benchmarks.

cs.CL | cs.AI | cs.LG | Recent | Jun 1, 2026

Multilinguality of Large Language Models From a Structural Perspective

Haruki Sakajo, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe

This paper analyzes the multilinguality of LLMs by examining their structural properties, finding that low-resource languages are structurally more distinct from English than high-resource languages,…

cs.CL | cs.AI | cs.LG | Recent | May 27, 2026

Extracting Small Translation Specialists from LLMs by Aggressively Pruning Experts

Liu O. Martin, Lucas Bandarkar, Nanyun Peng

The paper proposes an aggressive, parameter-efficient method to prune non-essential experts from Mixture-of-Experts (MoE) LLMs, significantly compressing the model while maintaining high machine trans…

cs.DL | cs.AI | cs.CL | Survey | Recent | Jul 29, 2026

Scientific Knowledge Discovery in the Age of Large Language Models

Eleni Adamidi, Serafeim Chatzopoulos, Thanasis Vergoulis

This paper surveys 34 peer-reviewed studies applying generative large language models to literature retrieval and eligibility screening.

cs.CL | cs.AI | cs.LG | Empirical | Recent | Jun 11, 2026

SkMTEB: Slovak Massive Text Embedding Benchmark and Model Adaptation

Marek Šuppa, Andrej Ridzik, Daniel Hládek, Natália Kňažeková +1 more

This paper introduces SkMTEB, a comprehensive text embedding benchmark for Slovak, and develops efficient, locally-deployable Slovak embeddings.

cs.CL | Empirical | Recent | Jul 24, 2026

A Factorial Study of Synthetic Data Generation for Low-Resource Machine Translation using Grammar Books

Varun Ghat Ravikumar, Sina Ahmadi, Lena Jäger, Rico Sennrich

This paper introduces a pipeline to extract grammatical rules, example sentences, and lexicons from grammar books and generates synthetic parallel corpora for fine-tuning machine translation models on…

cs.LG | cs.AI | Recent | May 31, 2026

When Data Is Scarce: Scaling Sparse Language Models with Repeated Training

Boqian Wu, Qiao Xiao, Patrik Okanovic, Tomasz Sternal +5 more

This paper introduces a new scaling law for sparse language models trained with limited data, demonstrating that sparsity can significantly improve performance and delay data saturation during multi-e…

cs.CL | cs.AI | Recent | May 30, 2026

EPIC: Efficient and Parallel Inference under CFG Constraints for Diffusion Language Models

Hyundong Jin, Yo-Sub Han

The paper proposes EPIC, an efficient and parallel decoding framework that significantly speeds up the process of constraining diffusion language model outputs using Context-Free Grammars (CFG).

cs.CL | Empirical | Recent | Jun 22, 2026

Randomized YaRN Improves Length Generalization for Long-Context Reasoning

Manas Mehta, Fangcong Yin, Greg Durrett

The paper proposes Randomized YaRN, a training method that improves length generalization in large language models by exposing them to out-of-distribution positional representations during training on…

cs.CL | cs.AI | cs.LG | Position | Recent | Jun 26, 2026

From Tokens to States: LLMs as a Special Case of World Models and the Continuous Path Beyond

Paul Dubois

The paper argues that large language models (LLMs) are a special case of world models and proposes a continuous spectrum between token prediction and latent-space architectures.

cs.CL | cs.AI | Recent | Jun 1, 2026

Multilingual Idioms in Sentences and Conversations Across High-, Medium-, and Low-Resource Languages

Saeed Almheiri, Bilal Elbouardi, Salsabila Zahirah Pranida, Irina Nikishina +15 more

The paper introduces MIDI, a novel multilingual dataset that embeds idioms in realistic sentence and conversational contexts across diverse resource levels, revealing that idiom comprehension is signi…

cs.CL | Recent | Jun 1, 2026

PortBERT: Navigating the Depths of Portuguese Language Models

Raphael Scheible-Schmitt, Henry He, Armando B. Mendes

The paper introduces PortBERT, a family of RoBERTa-based language models for Portuguese, which achieves competitive performance while explicitly balancing efficiency and accuracy.

cs.CL | Empirical | Recent | Jul 17, 2026

Rate-Utility Frontiers for Language Encodings: Comparing Tokens, Bytes, and Pixels Under Controlled Linguistic Content

Ingo Ziegler, Martin Krebs, Desmond Elliott

This paper compares the preservation of linguistic content in different text encodings (tokens, bytes, pixels) using a shared bottleneck, revealing their distinct strengths in surface form preservatio…

cs.CL | cs.AI | cs.LG | Recent | Jun 1, 2026

ProbeScale: Probing Analysis to Optimize Neural Scaling Laws for Efficient Small Language Model Inference

Sourav Das

ProbeScale is a novel framework that combines neural scaling laws and language model probing to identify highly efficient, task-specific subnetworks within pre-trained Small Language Models, achieving…

cs.CR | cs.AI | Recent | May 7, 2026

Fine-Tuning Small Language Models for Solution-Oriented Windows Event Log Analysis

Siraaj Akhtar, Saad Khan, Simon Parkinson

This paper demonstrates that fine-tuning small language models (SLMs) on a synthetic, solution-rich Windows event log dataset allows them to outperform larger LLMs in identifying issues and providing…

cs.LG | cs.AI | Recent | May 27, 2026

Efficient Pre-Training of LLMs through Truncated SVD Layers

Kaivan Kamali, Kajetan Schweighofer, Hormoz Shahrzad, Olivier Francon +2 more

The paper introduces TSVD, a novel framework that efficiently pre-trains LLMs by enforcing both low rank and strict weight orthonormality, achieving performance comparable to full-parameter models wit…

cs.CL | Empirical | Recent | Jul 2, 2026

BamiBERT: A New BERT-based Language Model for Vietnamese

Dat Quoc Nguyen, Thinh Pham, Chi Tran, Linh The Nguyen

This paper introduces BamiBERT, a new Vietnamese language model based on BERT that addresses limitations of PhoBERT and sets a new state-of-the-art among base-sized Vietnamese encoders.

cs.CL | cs.AI | cs.LG | Recent | May 27, 2026

Pruning and Distilling Mixture-of-Experts into Dense Language Models

Junhyuck Kim, Jihun Yun, Haechan Kim, Gyeongman Kim +2 more

The paper introduces a systematic framework to convert large Mixture-of-Experts (MoE) models into memory-efficient, fully dense architectures, achieving superior performance compared to traditional pr…
