Adversarial ML
Adversarial robustness, attacks, and defenses
20 papers indexed
Toward Polymorphic Backdoor against Semantic Communication via Intensity-Based Poisoning
The paper proposes SemBugger, a polymorphic backdoor attack that uses intensity-based poisoning to achieve diverse malicious outcomes in Semantic Communication (SC) systems, alongside a provable defen…
Backdoor Threats in Variational Quantum Circuits: Taxonomy, Attacks, and Defenses
This paper surveys the security vulnerabilities of Variational Quantum Circuits (VQCs) to backdoor attacks, detailing various attack mechanisms and analyzing current detection and defense strategies.
ProtoGuard-SL: Prototype Consistency Based Backdoor Defense for Vertical Split Learning
ProtoGuard-SL introduces a server-side defense that enhances vertical split learning robustness against backdoor attacks by enforcing class-conditional consistency in the embedding space.
Checkerboard: A Simple, Effective, Efficient and Learning-free Clean Label Backdoor Attack with Low Poisoning Budget
Yi Yang, Jinyang Huang, Binbin Liu, Feng-Qi Cui +4 more
The paper introduces Checkerboard, a novel, learning-free clean-label backdoor attack that efficiently poisons training data to compromise model integrity with a minimal poisoning budget.
Dummy-Aware Weighted Attack (DAWA): Breaking the Safe Sink in Dummy Class Defenses
Yunrui Yu, Xuxiang Feng, Pengda Qin, Pengyang Wang +4 more
The paper introduces Dummy-Aware Weighted Attack (DAWA), a novel evaluation method that significantly reduces the reported robustness of Dummy Classes-based defenses by simultaneously targeting both t…
SORA: Free Second-Order Attacks in Fast Adversarial Training
The paper introduces SORA, an adaptive adversarial training method that dynamically adjusts perturbation sizes to prevent Catastrophic Overfitting, achieving state-of-the-art robustness and clean accu…
When Interpretability Becomes a Liability: Adversarial Attacks on CBM Concept Layers
This paper demonstrates that Concept Bottleneck Models (CBMs), despite their interpretability, are highly vulnerable to targeted adversarial attacks that manipulate semantic concepts, and proposes SPE…
Backdoor Attacks on Decentralised Post-Training
This paper introduces the first backdoor attack specifically targeting pipeline parallelism in decentralized post-training, demonstrating that a limited adversary controlling an intermediate stage can…
CSC: Turning the Adversary's Poison against Itself
Yuchen Shi, Xin Guo, Huajie Chen, Tianqing Zhu +2 more
The paper proposes Cluster Segregation Concealment (CSC), a novel defense that identifies and neutralizes backdoor triggers by relabeling poisoned samples to a virtual class, achieving near-zero attac…
Cross-Modal Backdoors in Multimodal Large Language Models
The paper proposes a novel cross-modal backdoor attack that exploits the vulnerability of lightweight connectors in multimodal LLMs, demonstrating high attack success rates across different modalities…
BiRD: A Bidirectional Ranking Defense Mechanism for Retrieval Augmented Generation
Chengcai Gao, Zhihong Sun, Xiaochuan Shi, Qiufeng Wang +1 more
The paper proposes BiRD, a bidirectional ranking defense mechanism that enhances the robustness of Retrieval-Augmented Generation (RAG) against adversarial attacks by analyzing the alignment between f…
Provable Robustness against Backdoor Attacks via the Primal-Dual Perspective on Differential Privacy
The paper proposes a novel framework using the primal-dual perspective of differential privacy to provide a unified, modular, and end-to-end robustness certification for complex machine learning model…
Follow My Eyes: Backdoor Attacks on VLM-based Scanpath Prediction
Diana Romero, Mutahar Ali, Momin Ahmad Khan, Habiba Farrukh +2 more
This paper introduces the first backdoor attacks against VLM-based scanpath prediction, demonstrating variable-output attacks that evade detection and survive deployment on edge devices.
DeepSeek Robustness Against Semantic-Character Dual-Space Mutated Prompt Injection
The paper introduces PromptFuzz-SC, a novel semantic-character dual-space mutation framework, demonstrating that combining both semantic and character-level attacks significantly improves the robustne…
On the Vulnerability of Deep Automatic Modulation Classifiers to Explainable Backdoor Threats
This paper investigates a novel physical backdoor attack against Deep Automatic Modulation Classifiers (AMC) in wireless communications, demonstrating that an adversary using Explainable AI (XAI) can…
Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks
Yanming Mu, Hao Hu, Feiyang Li, Qiao Yuan +6 more
This paper provides the first comprehensive, end-to-end survey dedicated to the security of Retrieval-Augmented Generation (RAG) systems, systematically mapping threats, defenses, and benchmarks acros…
Backdoor Attacks on Fault Detection and Localization in Cyber-Physical Systems
This paper investigates the vulnerability of machine learning-based fault detection and localization systems in Cyber-Physical Systems (CPS) to backdoor attacks, demonstrating that such attacks are su…
Beyond Corner Patches: Semantics-Aware Backdoor Attack in Federated Learning
This paper proposes SABLE, a method for generating semantically meaningful and in-distribution backdoor triggers for federated learning, demonstrating that such attacks remain a potent and practical t…
Security in the Fine-Tuning Lifecycle of Large Language Models: Threats, Defenses, Evaluation, and Future Directions
This paper provides a systematic, lifecycle-based framework for analyzing security threats and defenses across the entire fine-tuning process of LLMs, revealing that attack effectiveness is highly mod…
Patcher: Post-Hoc Patching of Backdoored Large Language Models
Anjun Gao, Yueyang Quan, Yufei Xia, Zhuqing Liu +1 more
Patcher is a post-hoc defense framework that repairs backdoored large language models by localizing hidden triggers and patching the model using only a single reported failure case.