~ similar to 2604.05809v1· 20 results
Zida Li, Jun Li, Yuzhe Sha, Ziqiang Li +2 more
The paper introduces SET, a robust input-level backdoor detection framework that detects hidden malicious triggers in text-to-image diffusion models by analyzing systematic differences in how benign a…
The paper proposes a novel cross-modal backdoor attack that exploits the vulnerability of lightweight connectors in multimodal LLMs, demonstrating high attack success rates across different modalities…
Kai Wang, Jiale Zhang, Chengcheng Zhu, Chuang Ma +1 more
The paper proposes Hydra, a framework to stabilize and control the injection of multiple, conflicting backdoor triggers into text-to-image diffusion models, ensuring high attack reliability while main…
Duanyi Yao, Changyue Li, Zhicong Huang, Cheng Hong +1 more
The paper introduces Hidden Ads, a novel backdoor attack for Vision-Language Models (VLMs) that injects unauthorized advertisements by exploiting natural, recommendation-seeking user behaviors, mainta…
This paper introduces the Token by Token Backdoor Attack (ToBAC), demonstrating that unified autoregressive models (UAMs) are vulnerable to backdoor attacks where a single trigger can compromise multi…
Jiali Wei, Ming Fan, Guoheng Sun, Xicheng Zhang +2 more
The paper introduces BadStyle, a novel backdoor attack framework that generates natural, stealthy poisoned samples using LLMs to compromise various LLMs with high success rates and robust activation.
This survey provides a comprehensive taxonomy and vulnerability-centric analysis of adversarial attacks targeting Multimodal Large Language Models (MLLMs), offering an explanatory framework for enhanc…
The paper proposes TAGBD, a graph-aware backdoor attack that demonstrates that inconspicuous poison text alone can reliably compromise text-attributed graph learning systems.
Rui Wen, Mark Russinovich, Andrew Paverd, Jun Sakuma +1 more
The paper introduces MetaBackdoor, a novel class of LLM backdoor attacks that exploits positional encoding (length-based triggers) rather than requiring modifications to the textual content.
Ziqing Yang, Rui Wen, Xinlei He, Yun Shen +2 more
The paper introduces BadBone, a stealthy and adaptive backdoor attack that compromises a backbone model specifically to target downstream tasks utilizing prompt learning, demonstrating high attack suc…
The paper introduces SHADOWMASK, the first systematic backdoor attack targeting Masked Diffusion Language Models (MDLMs), demonstrating near-100% attack success while preserving clean model utility.
Guangsheng Zhang, Huan Tian, Leo Zhang, Tianqing Zhu +3 more
This paper systematically revisits and expands the threat model for backdoor attacks on semantic segmentation, proposing a unified framework (BADSEG) that demonstrates severe, previously overlooked vu…
Sneakdoor introduces a novel backdoor attack method that enhances stealthiness in dataset condensation by using a generative module to create input-aware triggers, achieving high attack efficacy while…
The paper proposes a secure and practical black-box text steganography method that uses a dynamic codebook and a multimodal LLM to embed secret messages into captions, outperforming existing technique…
The paper introduces Critical-CoT, a novel two-stage fine-tuning defense framework that equips LLMs with critical thinking abilities to detect and reject malicious reasoning steps introduced by advanc…
Jinghuai Zhang, Pengyue Yu, Zhexiao Lin, Kunlin Cai +2 more
ImageAuditor introduces a novel Membership Inference Attack (MIA) specifically designed for Image-based Retrieval-Augmented Generation (IRAG) systems, achieving high accuracy by addressing cross-modal…
The paper introduces BadSkill, a novel backdoor attack formulation that targets third-party agent skills by poisoning the embedded model artifacts, achieving high attack success rates across various m…
Dazhuang Liu, Yanqi Qiao, Rui Wang, Kaitai Liang +1 more
PASTA proposes a novel, twofold stealthy backdoor attack that enables high-success-rate backdoor activation across arbitrary patches in Vision Transformers by leveraging the Trigger Radiating Effect (…
Anjun Gao, Yueyang Quan, Yufei Xia, Zhuqing Liu +1 more
Patcher is a post-hoc defense framework that repairs backdoored large language models by localizing hidden triggers and patching the model using only a single reported failure case.
The paper proposes REACT, an adversarial training framework that significantly enhances the robustness and few-shot performance of machine-generated text detection by having a Retrieval-Augmented Gene…