~ similar to 2605.31246v1· 20 results
Zeyao Liu, Zhendong Zhao, Xiaojun Chen, Xin Zhao +2 more
The paper introduces VIPER, a novel backdoor attack framework that exploits the functional fusion of malicious and benign logic within dynamic prompt architectures, demonstrating a new, high-risk thre…
CLIP-Inspector (CI) is a novel model-level backdoor detection method that reconstructs potential triggers using out-of-distribution (OOD) images to verify the security of prompt-tuned CLIP models.
Guangsheng Zhang, Huan Tian, Leo Zhang, Tianqing Zhu +3 more
This paper systematically revisits and expands the threat model for backdoor attacks on semantic segmentation, proposing a unified framework (BADSEG) that demonstrates severe, previously overlooked vu…
Zida Li, Jun Li, Yuzhe Sha, Ziqiang Li +2 more
The paper introduces SET, a robust input-level backdoor detection framework that detects hidden malicious triggers in text-to-image diffusion models by analyzing systematic differences in how benign a…
The paper introduces BadSkill, a novel backdoor attack formulation that targets third-party agent skills by poisoning the embedded model artifacts, achieving high attack success rates across various m…
Yiyang Zhang, Chaojian Yu, Ziming Hong, Yuanjie Shao +3 more
The paper proposes a novel Text-Guided Backdoor (TGB) attack that uses common words in text descriptions as stealthy triggers for multimodal models, enhancing practicality and controllability.
This paper proposes SABLE, a method for generating semantically meaningful and in-distribution backdoor triggers for federated learning, demonstrating that such attacks remain a potent and practical t…
Duanyi Yao, Changyue Li, Zhicong Huang, Cheng Hong +1 more
The paper introduces Hidden Ads, a novel backdoor attack for Vision-Language Models (VLMs) that injects unauthorized advertisements by exploiting natural, recommendation-seeking user behaviors, mainta…
The paper demonstrates that LoRA adapters can be backdoored via data poisoning, showing the backdoor generalizes at the token feature level, and proposes robust behavioral and weight-level detectors f…
This paper demonstrates that LoRA adapters can be backdoored via data poisoning, showing that the resulting backdoor generalizes at the token feature level, and proposes robust behavioral and weight-l…
The paper introduces Critical-CoT, a novel two-stage fine-tuning defense framework that equips LLMs with critical thinking abilities to detect and reject malicious reasoning steps introduced by advanc…
The security of LLM agents is critically dependent on their system prompt configuration, which creates a brittle attack surface that can be exploited by attackers inverting the prompt's core assumptio…
Jiali Wei, Ming Fan, Guoheng Sun, Xicheng Zhang +2 more
The paper introduces BadStyle, a novel backdoor attack framework that generates natural, stealthy poisoned samples using LLMs to compromise various LLMs with high success rates and robust activation.
Kai Wang, Jiale Zhang, Chengcheng Zhu, Chuang Ma +1 more
The paper proposes Hydra, a framework to stabilize and control the injection of multiple, conflicting backdoor triggers into text-to-image diffusion models, ensuring high attack reliability while main…
Kealan Dunnett, Reza Arablouei, Dimity Miller, Volkan Dedeoglu +1 more
The paper proposes a detection-aware adversarial fine-tuning framework to mitigate backdoor attacks in object detection models, achieving better defense while preserving clean detection performance co…
This paper proposes a density-aware attack that constructs triggers by placing poisoned samples in low-density regions of the clean data distribution, achieving high attack success rates even after st…
Yinbo Yu, Jing Fang, Xuewen Zhang, Chunwei Tian +3 more
The paper proposes DFBScanner, a lightweight static parameter inspection framework that detects backdoor attacks by analyzing anomalous parameter updates in the final classification layer, achieving f…
Mohan Zhang, Yuqi Jia, Zhen Tan, Steven Jiang +3 more
This study provides the first systematic measurement of prompt injection attacks in a real-world LLM-based resume screening application, finding that approximately 1% of resumes contain hidden injecti…
Mohan Zhang, Yuqi Jia, Zhen Tan, Steven Jiang +3 more
This study provides the first large-scale measurement of prompt injection attacks in real-world LLM-based resume screening, finding that approximately 1% of resumes contain hidden injections.
Oubo Ma, Ruixiao Lin, Yang Dai, Jiahao Chen +3 more
This paper systematically investigates how various plasticity interventions affect the vulnerability of deep reinforcement learning agents to backdoor attacks, finding that most interventions mitigate…