~ similar to 2605.28642· 17 results
Sicheng Yang, Shulan Ruan, Shiwei Wu, Yu Liu +3 more
PolySpeech-100 introduces a massive, multi-lingual benchmark covering 110 linguistic variants to rigorously test Speech-LLMs, demonstrating that open-source models struggle with low-resource languages…
The paper benchmarks local, offline LLMs for confidential translation workflows, demonstrating that while they are viable for privacy-sensitive use, they generally lag behind top commercial NMT system…
Zihan Liu, Yizhen Wang, Rui Wang, Xiu Tang +1 more
This survey provides a comprehensive, structured taxonomy of split learning techniques for fine-tuning Large Language Models (LLMs), covering model optimization, system efficiency, and privacy preserv…
Ao Ding, Hongzong Li, Zi Liang, Zhanpeng Shi +4 more
The paper investigates the security risk of extracting knowledge from quantized LLMs deployed on edge devices, showing that structured querying can effectively bypass quantization protections.
Yanjie An, Yuxiang Zhao, Yichi Zhang, Qixi Zheng +4 more
The paper introduces OpenSTBench, a unified, multidimensional evaluation framework designed to comprehensively compare heterogeneous speech translation systems by jointly assessing translation, speech…
Haechan Kim, Seungjun Chung, Inkyu Park, Jihoo Lee +1 more
The paper introduces three new Korean speech benchmarks (KVoiceBench, KOpenAudioBench, and KMMAU) to evaluate SpeechLMs, demonstrating that English-centric evaluation fails to capture performance gaps…
SafeLM is a comprehensive framework that jointly addresses privacy, security, misinformation, and adversarial robustness in federated LLMs, achieving high safety performance while significantly reduci…
Jeongho Yoon, Chanhee Park, Yongchan Chun, Hyeonseok Moon +1 more
The paper introduces Privacy-Preserving Fine-Tuning (PPFT), a novel two-stage pipeline that allows LLMs to process sensitive data via pooled embeddings rather than raw text, achieving a strong balance…
Cong Kong, Xin Cheng, Zhaoxia Yin, Shuai Li +2 more
VertMark introduces a novel, unified, and training-free framework to embed robust watermarks into vertical domain pre-trained language models (VPLMs) for copyright protection across multiple specializ…
FedSpy-LLM introduces a scalable and generalizable data reconstruction attack that can extract private training data from shared gradients of large language models, even when using Parameter-Efficient…
Peihua Mai, Xuanrong Gao, Youlong Ding, Xianglong Du +2 more
SharedRequest introduces a model-agnostic framework that enhances LLM privacy and efficiency by batching and mixing prompts with noisy variants, achieving high utility and significant cost reduction.
Gangmuk Lim, Wanyu Zhao, Brighten Godfrey, Jiaxin Shan +2 more
Lodestar is a novel online learning-based request routing system that significantly improves LLM inference efficiency by dynamically assigning incoming requests to the optimal GPU instance to minimize…
Guanlong Wu, Zhaohan li, Yao Zhang, Zheng Zhang +3 more
CachePrune introduces a privacy-aware, fine-grained KV cache sharing mechanism that allows LLM inference systems to safely reuse cache entries across users' requests, significantly improving efficienc…
Murmur is an efficient inference system for long-form ASR that resolves the accuracy-latency trade-off by optimizing both inter-chunk processing and intra-chunk attention mechanisms.
The paper proposes a novel cross-modal backdoor attack that exploits the vulnerability of lightweight connectors in multimodal LLMs, demonstrating high attack success rates across different modalities…
Echelon introduces a boundary-first training architecture that enables cross-organization language-model adaptation while strictly enforcing device-level model state non-export, achieving strong perfo…
Divergence Decoding (DD) is a novel, effective, and inexpensive method that uses auxiliary models to steer LLM logits during inference, enabling the removal of memorized sensitive data without signifi…