~ similar to 2605.04209v1· 20 results
The paper demonstrates that cryptographically undetectable backdoors can be embedded into modern, state-of-the-art neural networks by exploiting inherent, latent geometric properties of the learned re…
Sneakdoor introduces a novel backdoor attack method that enhances stealthiness in dataset condensation by using a generative module to create input-aware triggers, achieving high attack efficacy while…
This paper proposes a density-aware attack that constructs triggers by placing poisoned samples in low-density regions of the clean data distribution, achieving high attack success rates even after st…
Yi Yang, Jinyang Huang, Binbin Liu, Feng-Qi Cui +4 more
The paper introduces Checkerboard, a novel, learning-free clean-label backdoor attack that efficiently poisons training data to compromise model integrity with minimal poisoning budget.
Kealan Dunnett, Reza Arablouei, Dimity Miller, Volkan Dedeoglu +1 more
The paper proposes a detection-aware adversarial fine-tuning framework to mitigate backdoor attacks in object detection models, achieving better defense while preserving clean detection performance co…
The paper demonstrates that LoRA adapters can be backdoored via data poisoning, showing the backdoor generalizes at the token feature level, and proposes robust behavioral and weight-level detectors f…
This paper demonstrates that LoRA adapters can be backdoored via data poisoning, showing that the resulting backdoor generalizes at the token feature level, and proposes robust behavioral and weight-l…
Zi Li, Tian Zhou, Wenze Li, Jingyu Hua +2 more
This paper introduces a novel supply-chain attack that uses model code backdoors to actively steal sensitive secrets from local LLM fine-tuning datasets, bypassing current privacy defenses.
The paper introduces DiffusionHijack, a supply-chain backdoor attack that compromises the PRNG used by diffusion models to deterministically control generated images, which is successfully mitigated b…
Dazhuang Liu, Yanqi Qiao, Rui Wang, Kaitai Liang +1 more
DETOUR proposes a practical backdoor attack against object detection models by using semantic triggers that are robust to variations in size, location, and field of view (FoV), overcoming limitations…
This paper introduces a dual-layer side-channel attack framework that exploits the variable workload introduced by dynamic image preprocessing in local Vision-Language Models (VLMs) to infer sensitive…
Zida Li, Jun Li, Yuzhe Sha, Ziqiang Li +2 more
The paper introduces SET, a robust input-level backdoor detection framework that detects hidden malicious triggers in text-to-image diffusion models by analyzing systematic differences in how benign a…
The paper compares two sparse autoencoder architectures, finding that Differential SAEs (Diff-SAE) significantly outperform Crosscoders in isolating backdoor-related features in language models.
Yinbo Yu, Jing Fang, Xuewen Zhang, Chunwei Tian +3 more
The paper proposes DFBScanner, a lightweight static parameter inspection framework that detects backdoor attacks by analyzing anomalous parameter updates in the final classification layer, achieving f…
The paper proposes a novel framework using the primal-dual perspective of differential privacy to provide a unified, modular, and end-to-end robustness certification for complex machine learning model…
Yinbo Yu, Xueyu Yin, Jing Fang, Chunwei Tian +3 more
The paper proposes HTell, a fast and lightweight data-free backdoor detector that analyzes the abnormal response concentration of backdoored models on the target class using random latent probes appli…
BackFlush introduces a novel, knowledge-free framework that detects and eliminates unknown backdoor attacks in LLMs while simultaneously preserving existing watermarks, achieving high detection rates…
The paper demonstrates that current AI watermark removal techniques fail to achieve true forensic stealth, as the removal process often leaves behind detectable signals that distinguish the output fro…
Yuchen Shi, Xin Guo, Huajie Chen, Tianqing Zhu +2 more
The paper proposes Cluster Segregation Concealment (CSC), a novel defense that identifies and neutralizes backdoor triggers by relabeling poisoned samples to a virtual class, achieving near-zero attac…
The paper proposes FL-PBM, a novel pre-training defense mechanism for federated learning that proactively filters poisoned data using a multi-stage process, significantly reducing backdoor attack succ…