ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2604.27487v1· 18 results

cs.LGcs.CLRecentMay 28, 2026

CSULoRA: Closest Safe Update Low-Rank Adaptation

Oleksandr Marchenko Breneur, Adelaide Danilov, Aria Nourbakhsh, Salima Lamsiyah

CSULoRA is a post-hoc method that corrects trained LoRA adapters by estimating a safety-aligned subspace and solving a penalized minimum-change problem to attenuate unsafe update directions while pres…

View →
cs.LGcs.AIcs.CVRecentMay 30, 2026

SORA: Free Second-Order Attacks in Fast Adversarial Training

Mazdak Teymourian, Ramtin Moslemi, Farzan Rahmani, Mohammad Hossein Rohban

The paper introduces SORA, an adaptive adversarial training method that dynamically adjusts perturbation sizes to prevent Catastrophic Overfitting, achieving state-of-the-art robustness and clean accu…

View →
cs.LGcs.AIcs.CRRecentMay 6, 2026

Information Theoretic Adversarial Training of Large Language Models

Yiwei Zhang, Jeremiah Birrell, Reza Ebrahimi, Rouzbeh Behnia +2 more

The paper proposes WARDEN, a distributionally robust adversarial training framework that significantly reduces LLM vulnerability to adversarial attacks by dynamically reweighting hard adversarial exam…

View →
cs.CRcs.AIRecentMay 17, 2026

When Efficiency Backfires: Cascading LLMs Trigger Cascade Failure under Adversarial Attack

Zehan Sun, Dingfan Chen, Songze Li

This paper demonstrates that LLM cascade systems, designed for efficiency, are vulnerable to targeted adversarial attacks that simultaneously degrade both performance and cost-efficiency.

View →
cs.CRcs.AIRecentMar 30, 2026

Adversarial Attacks on Multimodal Large Language Models: A Comprehensive Survey

Bhavuk Jain, Sercan Ö. Arık, Hardeo K. Thakur

This survey provides a comprehensive taxonomy and vulnerability-centric analysis of adversarial attacks targeting Multimodal Large Language Models (MLLMs), offering an explanatory framework for enhanc…

View →
cs.CRcs.AIcs.CVRecentMay 15, 2026

DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models

Ye Sun, Xin Wang, Jiaming Zhang, Yifeng Gao +6 more

DarkLLM introduces a novel framework that uses a Large Language Model (LLM) to translate natural language instructions into flexible, latent adversarial attack vectors, demonstrating a systemic vulner…

View →
cs.CRRecentMay 8, 2026

Improving Parameter-Efficient Federated Learning with Differentially Private Refactorization

Linh Tran, Ana Milanova, Stacy Patterson

The paper proposes FedPower, a novel differentially private cross-silo Federated Learning framework that uses PowerDP to reconstruct and project client updates into a secure low-rank space, effectivel…

View →
cs.CRcs.CVcs.LGRecentMay 13, 2026

LoREnc: Low-Rank Encryption for Securing Foundation Models and LoRA Adapters

Beomjin Ahn, Jungmin Kwon, Chanyong Jung, Jaewook Chung

LoREnc is a novel, training-free framework that secures Foundation Models (FMs) and LoRA adapters against intellectual property leakage and model recovery attacks by spectrally truncating weights and…

View →
cs.LGcs.AIRecentMay 27, 2026

Efficient Pre-Training of LLMs through Truncated SVD Layers

Kaivan Kamali, Kajetan Schweighofer, Hormoz Shahrzad, Olivier Francon +2 more

The paper introduces TSVD, a novel framework that efficiently pre-trains LLMs by enforcing both low rank and strict weight orthonormality, achieving performance comparable to full-parameter models wit…

View →
cs.SDcs.AIcs.CRRecentJun 4, 2026

Beyond Waveform Robustness: Robust Feature-Vocoder Adversarial Attacks on Automatic Speech Recognition

Yifan Liao, Zongmin Zhang, Zhen Sun, Yuhui Sun +2 more

The paper introduces a novel Clean-Referenced Feature-Vocoder Attack, a black-box adversarial attack that perturbs high-level SSL feature representations instead of raw audio waveforms, achieving supe…

View →
cs.CRcs.AIcs.LGRecentMay 24, 2026

Security in the Fine-Tuning Lifecycle of Large Language Models: Threats, Defenses,Evaluation, and Future Directions

Wenjuan Li, Yitao Liu, Runze Chen, Rajkumar Buyya

This paper provides a systematic, lifecycle-based framework for analyzing security threats and defenses across the entire fine-tuning process of LLMs, revealing that attack effectiveness is highly mod…

View →
cs.CRcs.AIRecentMay 11, 2026

Threat Modelling using Domain-Adapted Language Models: Empirical Evaluation and Insights

Saba Pourhanifeh, AbdulAziz AbdulGhaffar, Ashraf Matrawy

The paper empirically evaluates domain-adapted and general-purpose LLMs for structured threat modelling (STRIDE on 5G security), finding that domain adaptation and model size do not guarantee reliable…

View →
cs.LGcs.CRcs.CVRecentMay 22, 2026

Sample-wise Targeted Adversarial Attacks on Test-time Adaptation

Phuc Duc Nguyen, Quang Duc Nguyen

The paper introduces a sample-wise targeted adversarial attack that successfully misclassifies only specific, triggered inputs during test-time adaptation while maintaining the overall label distribut…

View →
cs.AIRecentMay 28, 2026

Harnessing non-adversarial robustness in large language models

Qinghua Zhou, Ellina Aleshina, Andrey Lovyagin, Oleg Somov +5 more

The paper proposes a debiasing fine-tuning technique to efficiently enhance the robustness of Large Language Models against semantically similar but textually altered prompts.

View →
cs.CRcs.AIcs.LGRecentMay 22, 2026

Adversarial Vulnerability Under Temporal Concept Drift: A Longitudinal Study of Android Malware Detection

Ahmed Sabbah, Mohammed Kharma, Radi Jarrar, Samer Zein +1 more

This study longitudinally evaluates the adversarial robustness of Android malware detection systems over a decade, finding that temporal separation significantly degrades robustness due to concept dri…

View →
cs.CRcs.CLRecentApr 24, 2026

Training a General Purpose Automated Red Teaming Model

Aishwarya Padmakumar, Leon Derczynski, Traian Rebedea, Christopher Parisien

The paper proposes a general-purpose pipeline to train automated red teaming models capable of generating attacks for arbitrary adversarial goals, overcoming the limitations of current methods that ar…

View →
cs.CRcs.AIRecentMay 27, 2026

GEO-Bench: Benchmarking Ranking Manipulation in Generative Engine Optimization

Ojas Nimase, Zhe Chen, Gengpei Qi, Yue Zhao +1 more

The paper introduces GEO-Bench, a unified benchmark that standardizes the evaluation of various generative engine optimization (GEO) ranking manipulation attacks, demonstrating that black-box content…

View →
cs.CRcs.AIRecentMay 27, 2026

GEO-Bench: Benchmarking Ranking Manipulation in Generative Engine Optimization

Ojas Nimase, Zhe Chen, Gengpei Qi, Yue Zhao +1 more

GEO-Bench introduces a standardized benchmark to compare various ranking manipulation attacks (both black-box and white-box) on generative engines, demonstrating that black-box content rewriting can b…

View →