20 results for “multilingual models”
CS papers only
Hybrid search: keyword + semantic, ranked by combined score.
This study systematically analyzes strategies for creating reliable multilingual LLMs-as-a-judge, finding that fine-tuning smaller models with in-domain data is effective, while zero-shot evaluation w…
Marek Šuppa, Andrej Ridzik, Daniel Hládek, Natália Kňažeková +1 more
This paper introduces SkMTEB, a comprehensive text embedding benchmark for Slovak, and develops efficient, locally-deployable Slovak embeddings.
This paper demonstrates that multilingual code-switching instruction tuning, involving four languages (English, Japanese, Korean, and Chinese), significantly improves average multilingual performance,…
The paper introduces a diagnostic framework to decompose multilingual LLM performance variance, showing that language identity and model-benchmark interactions are key drivers of performance gaps.
The paper introduces MIDI, a novel multilingual dataset that embeds idioms in realistic sentence and conversational contexts across diverse resource levels, revealing that idiom comprehension is signi…
The paper develops a theoretically grounded framework for evaluating multilingual LLMs in Social Sciences and Humanities, moving beyond traditional NLP benchmarks to assess interpretive validity and c…
Yunhan Zhao, Zhaorun Chen, Xingjun Ma, Yu-Gang Jiang +1 more
The paper introduces ML-Bench, a policy-grounded multilingual safety benchmark, and ML-Guard, a superior guardrail model that enables culturally and legally aligned safety assessment for LLMs across 1…
The paper proposes decomposing the assessment of massive multilingual parallel data into separate parallelism and quality estimation components, concluding that no single universal metric is reliable…
The paper proposes an unsupervised Reinforcement Learning approach that enforces cross-lingual self-consistency to significantly enhance the multilingual reasoning capabilities of large language model…
The paper proposes MIMO, a two-stage framework that improves Multilingual Information Retrieval (MLIR) by stabilizing cross-lingual alignment and enhancing retrieval discrimination using a combination…
This pilot study evaluates curator-guided multilingual art description using a small, on-premise VLM (Qwen2.5-VL-3B-Instruct) for German, Romanian, and Serbian, finding that language-specific adapters…
Sarmistha Das, Vaibhav Vishal, Shreyas Guha, Amaan Ali +2 more
This paper introduces a Hybrid Mixture-of-Experts (HybridMoE) framework and a specialized corpus (Varnika) to significantly improve language models' ability to understand and retain figurative, cultur…
This paper analyzes the multilinguality of LLMs by examining their structural properties, finding that low-resource languages are structurally more distinct from English than high-resource languages,…
The paper introduces XLGoBench, a synthetic benchmark of algorithmic tasks designed to detect persistent cross-lingual skill gaps in large language models.
This paper introduces robustness indicators to systematically analyze how multilingual text embedding model rankings change based on dataset composition and aggregation methods, revealing that only a…
The paper proposes an aggressive, parameter-efficient method to prune non-essential experts from Mixture-of-Experts (MoE) LLMs, significantly compressing the model while maintaining high machine trans…
The paper introduces PortBERT, a family of RoBERTa-based language models for Portuguese, which achieves competitive performance while explicitly balancing efficiency and accuracy.
Guanzhi Deng, Kuan Wu, Haibo Wang, Shing Yin Wong +2 more
The paper introduces RA-MoE, a novel fine-tuning framework that leverages the internal routing structure of Mixture-of-Experts (MoE) models to improve performance on multilingual downstream tasks by a…
The paper introduces Multi-Legal-Bench, a novel cross-jurisdictional benchmark evaluating LLMs on five standardized legal reasoning tasks across six diverse countries, demonstrating that cross-lingual…
This paper details the systematic construction and training of a high-performing Romanian Vision-Language Model (VLM), demonstrating that language-specific adaptation significantly boosts performance…