~ similar to 2605.30385· 20 results
Xinyang Lu, Jiabao Pan, Rachael Hwee Ling Sim, See-Kiong Ng +2 more
The paper proposes DareU, a novel LLM unlearning framework that optimizes unlearning by zeroing out data attribution scores instead of maximizing prediction loss, achieving effective unlearning while…
Yue Zhao, Yujia Gong, Ruigang Liang, Shenchen Zhu +3 more
The paper introduces Cross-Model Neuron Transfer (CNT), a post-hoc method that efficiently transfers safety-oriented functionalities between different large language models by transferring minimal sub…
The paper proposes a novel bi-level exact unlearning attack targeting Large Reasoning Models (LRMs) that forces incorrect final answers while generating misleading reasoning traces, highlighting new s…
This paper proposes using offline reinforcement learning (RL) as an efficient alternative to online RL for post-training code-generating LLMs, demonstrating its effectiveness, especially for smaller m…
Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang +7 more
This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM trainin…
The paper challenges the conclusion that LLMs lack reasoning by demonstrating that reported performance drops on GSM-Symbolic are often statistically weak and partially attributable to dataset biases,…
Senmiao Wang, Tiantian Fang, Haoran Zhang, Yushun Zhang +3 more
This paper proposes a preconditioning layer for stable weight conditioning in LLM training.
Senmiao Wang, Tiantian Fang, Haoran Zhang, Yushun Zhang +3 more
This paper proposes a preconditioning layer for stable weight conditioning in LLM training.
The paper proposes two novel multi-column RBFN architectures, MC-PSO and MC-APSO, that combine parallel RBFN structures with swarm optimization to significantly outperform existing methods in accuracy…
The paper proposes SubFit, a novel compression technique that achieves superior LLM compression by replacing non-contiguous, submodule-level components (Attention and FeedForward) with lightweight res…
The paper introduces Chunk-Level Guided Generation, a training-free method that uses an off-the-shelf large language model (LLM) as a process scorer to guide small model generation, achieving performa…
This paper demonstrates that LLM cascade systems, designed for efficiency, are vulnerable to targeted adversarial attacks that simultaneously degrade both performance and cost-efficiency.
Zihan Liu, Yizhen Wang, Rui Wang, Xiu Tang +1 more
This survey provides a comprehensive, structured taxonomy of split learning techniques for fine-tuning Large Language Models (LLMs), covering model optimization, system efficiency, and privacy preserv…
The paper introduces Synthesis Data Reversion (SDR), a method that infers the data laundering transformation used in LLM training and synthesizes queries to restore the detection signals lost when pro…
The paper proposes using fine-grained quality signals, such as pairwise self-judgments and token-level entropy, instead of simple binary correctness to improve LLM performance on saturated datasets, s…
Zhenghua Bao, Fengya Tian, Chris Zhang, Zhenjun Chen +2 more
OrcaRouter is a production-ready LLM router that uses a hybrid offline-online learning approach to efficiently select the best large language model for an incoming query, achieving high accuracy at lo…
Zizhe Chen, Jiqian Dong, Yizhou Tian, Garry Yang +3 more
This paper introduces Numca and Hista, two novel techniques that significantly improve state value estimation for LLM reinforcement learning, addressing the instability of standard critic approaches.
Pei Chen, Geng Hong, Xinyi Wu, Mengying Wu +5 more
This paper systematically analyzes the resilience of LLM-enhanced search engines against black-hat SEO attacks, finding that while they block most traditional attacks, they remain vulnerable to sophis…
This study empirically benchmarks classical and quantum machine learning models for image recognition, finding that while quantum models offer superior accuracy and resource efficiency at high dimensi…
NANOZK introduces a novel, highly efficient zero-knowledge proof system that allows users to cryptographically verify that the output of a large language model (LLM) was generated by a specific, claim…