ArXivCSExplorer
Built by Teycir Ben Soltane

Similar to 2605.07961v1 · 20 results

cs.CR · cs.CL · cs.DC · Recent · Apr 27, 2026

A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations

Zihan Liu, Yizhen Wang, Rui Wang, Xiu Tang +1 more

This survey provides a comprehensive, structured taxonomy of split learning techniques for fine-tuning Large Language Models (LLMs), covering model optimization, system efficiency, and privacy preserv…

cs.LG · cs.AI · cs.CR · Recent · May 11, 2026

DP-LAC: Lightweight Adaptive Clipping for Differentially Private Federated Fine-tuning of Language Models

Haaris Mehmood, Jie Xu, Karthikeyan Saravanan, Rogier Van Dalen +1 more

The paper proposes DP-LAC, a novel lightweight adaptive clipping technique for differentially private federated fine-tuning, which efficiently estimates and adapts the clipping threshold without consu…

cs.CR · cs.LG · Recent · Apr 7, 2026

FedSpy-LLM: Towards Scalable and Generalizable Data Reconstruction Attacks from Gradients on LLMs

Syed Irfan Ali Meerza, Feiyi Wang, Jian Liu

FedSpy-LLM introduces a scalable and generalizable data reconstruction attack that can extract private training data from shared gradients of large language models, even when using Parameter-Efficient…

cs.CR · cs.LG · Recent · Apr 17, 2026

SafeLM: Unified Privacy-Aware Optimization for Trustworthy Federated Large Language Models

Noor Islam S. Mohammad, Uluğ Bayazıt

SafeLM is a comprehensive framework that jointly addresses privacy, security, misinformation, and adversarial robustness in federated LLMs, achieving high safety performance while significantly reduci…

cs.CR · cs.LG · Recent · Jun 2, 2026

RogueMerge: Robust and Unified Attacks against LLM Model Merging

Jinghuai Zhang, Yetian He, Kunlin Cai, Han Zhao +2 more

RogueMerge introduces a unified framework to robustly attack LLM model merging by addressing the challenges of autoregressive decoding, unknown merging configurations, and prompt generalization, signi…

cs.CR · Recent · Apr 1, 2026

When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion

Jiaqing Li, Zhibo Zhang, Shide Zhou, Yuxi Li +2 more

The paper introduces TrojanMerge, a framework demonstrating that model merging can be exploited to systematically compromise the safety alignment of multiple individually safe LLMs.

cs.CR · cs.LG · Recent · May 7, 2026

FedAttr: Towards Privacy-preserving Client-Level Attribution in Federated LLM Fine-tuning

Su Zhang, Junfeng Guo, Heng Huang

FedAttr introduces a novel client-level attribution protocol for Federated Learning (FL) that accurately identifies which clients trained on watermarked data while maintaining strong privacy guarantee…

cs.CR · cs.CL · Recent · Apr 9, 2026

The Art of (Mis)alignment: How Fine-Tuning Methods Effectively Misalign and Realign LLMs in Post-Training

Rui Zhang, Hongwei Li, Yun Shen, Xinyue Shen +5 more

The paper investigates how various fine-tuning methods can be used both to intentionally misalign and subsequently realign large language models (LLMs), revealing distinct strengths for attack and def…

cs.CR · cs.AI · Recent · May 27, 2026

GEO-Bench: Benchmarking Ranking Manipulation in Generative Engine Optimization

Ojas Nimase, Zhe Chen, Gengpei Qi, Yue Zhao +1 more

The paper introduces GEO-Bench, a unified benchmark that standardizes the evaluation of various generative engine optimization (GEO) ranking manipulation attacks, demonstrating that black-box content…

cs.CR · cs.AI · Recent · May 27, 2026

GEO-Bench: Benchmarking Ranking Manipulation in Generative Engine Optimization

Ojas Nimase, Zhe Chen, Gengpei Qi, Yue Zhao +1 more

GEO-Bench introduces a standardized benchmark to compare various ranking manipulation attacks (both black-box and white-box) on generative engines, demonstrating that black-box content rewriting can b…

cs.CR · cs.AI · Recent · Mar 30, 2026

Adversarial Attacks on Multimodal Large Language Models: A Comprehensive Survey

Bhavuk Jain, Sercan Ö. Arık, Hardeo K. Thakur

This survey provides a comprehensive taxonomy and vulnerability-centric analysis of adversarial attacks targeting Multimodal Large Language Models (MLLMs), offering an explanatory framework for enhanc…

cs.CR · cs.AI · cs.LG · Recent · May 24, 2026

Security in the Fine-Tuning Lifecycle of Large Language Models: Threats, Defenses, Evaluation, and Future Directions

Wenjuan Li, Yitao Liu, Runze Chen, Rajkumar Buyya

This paper provides a systematic, lifecycle-based framework for analyzing security threats and defenses across the entire fine-tuning process of LLMs, revealing that attack effectiveness is highly mod…

cs.CR · cs.LG · Recent · Apr 8, 2026

FedDetox: Robust Federated SLM Alignment via On-Device Data Sanitization

Shunan Zhu, Jiawei Chen, Yonghao Yu, Hideya Ochiai

FedDetox introduces a robust framework that sanitizes toxic data on edge devices during federated learning to maintain the safety alignment of Small Language Models (SLMs) without sacrificing utility.

cs.CR · cs.AI · Recent · May 17, 2026

When Efficiency Backfires: Cascading LLMs Trigger Cascade Failure under Adversarial Attack

Zehan Sun, Dingfan Chen, Songze Li

This paper demonstrates that LLM cascade systems, designed for efficiency, are vulnerable to targeted adversarial attacks that simultaneously degrade both performance and cost-efficiency.

cs.LG · cs.CR · Recent · May 17, 2026

DP-SelFT: Differentially Private Selective Fine-Tuning for Large Language Models

Haichao Sha, Zihao Wang, Yuncheng Wu, Hong Chen +1 more

The paper proposes DP-SelFT, a novel framework for differentially private selective fine-tuning that significantly improves the privacy-utility trade-off for LLMs by intelligently selecting robust par…

cs.LG · cs.CR · Recent · Mar 19, 2026

Revisiting Label Inference Attacks in Vertical Federated Learning: Why They Are Vulnerable and How to Defend

Yige Liu, Dexuan Xu, Zimai Guo, Yongzhi Cao +1 more

This paper analyzes label inference attacks in Vertical Federated Learning (VFL), demonstrating that existing attacks rely on feature-label distribution alignment, and proposes a zero-overhead defense…

cs.CR · cs.AI · cs.CV · Recent · May 15, 2026

DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models

Ye Sun, Xin Wang, Jiaming Zhang, Yifeng Gao +6 more

DarkLLM introduces a novel framework that uses a Large Language Model (LLM) to translate natural language instructions into flexible, latent adversarial attack vectors, demonstrating a systemic vulner…

cs.CR · Recent · Mar 17, 2026

Rotated Robustness: A Training-Free Defense against Bit-Flip Attacks on Large Language Models

Deng Liu, Song Chen

The paper introduces Rotated Robustness (RoR), a training-free defense that uses orthogonal transformations to prevent catastrophic model collapse in LLMs caused by hardware bit-flip attacks.

cs.CR · cs.AI · Recent · Mar 23, 2026

Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks

Yanming Mu, Hao Hu, Feiyang Li, Qiao Yuan +6 more

This paper provides the first comprehensive, end-to-end survey dedicated to the security of Retrieval-Augmented Generation (RAG) systems, systematically mapping threats, defenses, and benchmarks acros…

cs.CR · Recent · Apr 23, 2026

Black-Box Skill Stealing Attack from Proprietary LLM Agents: An Empirical Study

Zihan Wang, Rui Zhang, Yu Liu, Chi Liu +3 more

This paper presents the first systematic study of black-box skill stealing attacks against proprietary LLM agents, demonstrating that structured agent skills can be easily extracted, posing a signific…
