~ similar to 2603.20339v2· 20 results
Yiyang Zhang, Chaojian Yu, Ziming Hong, Yuanjie Shao +3 more
The paper proposes a novel Text-Guided Backdoor (TGB) attack that uses common words in text descriptions as stealthy triggers for multimodal models, enhancing practicality and controllability.
Yuchen Shi, Xin Guo, Huajie Chen, Tianqing Zhu +2 more
The paper proposes Cluster Segregation Concealment (CSC), a novel defense that identifies and neutralizes backdoor triggers by relabeling poisoned samples to a virtual class, achieving near-zero attac…
The paper introduces 'covert control attacks,' a novel and stealthy data poisoning method that teaches LLMs an information hiding scheme, allowing malicious instructions to be encoded and decoded and…
Yang Luo, Zifeng Kang, Tiantian Ji, Xinran Liu +3 more
The paper introduces SHADOWMERGE, a novel poisoning attack that successfully compromises graph-based agent memory by exploiting relation-channel conflicts, achieving a high attack success rate across…
Khang Tran, Yazan Boshmaf, Issa Khalil, NhatHai Phan +2 more
The paper introduces Poison-with-Style (PwS), a stealthy model poisoning attack that exploits developers' inherent code styles as covert triggers to make Code LLMs generate vulnerable code without exp…
The paper proposes Open-Book Benign Rewriting (OBBR), a novel defense mechanism that uses LLM rewriting with benign samples to neutralize data poisoning attacks against LLMs, significantly improving s…
The paper introduces BadSkill, a novel backdoor attack formulation that targets third-party agent skills by poisoning the embedded model artifacts, achieving high attack success rates across various m…
The paper introduces Oracle Poisoning, an attack that corrupts knowledge graphs used by AI agents, demonstrating that all tested models blindly trust poisoned data at high sophistication levels.
Luze Sun, Anshuman Suri, Harsh Chaudhari, Cristina Nita-Rotaru +1 more
The paper introduces PoisonForge, a comprehensive benchmark demonstrating that even a small number of targeted poisoned examples can significantly compromise the safety and reliability of instruction-…
The paper proposes SemBugger, a polymorphic backdoor attack that uses intensity-based poisoning to achieve diverse malicious outcomes in Semantic Communication (SC) systems, alongside a provable defen…
Duanyi Yao, Changyue Li, Zhicong Huang, Cheng Hong +1 more
The paper introduces Hidden Ads, a novel backdoor attack for Vision-Language Models (VLMs) that injects unauthorized advertisements by exploiting natural, recommendation-seeking user behaviors, mainta…
Desen Sun, Jason Hon, Howe Wang, Saarth Rajan +2 more
This paper investigates a novel security vulnerability where imperceptible branding hints can be injected into images and subsequently re-rendered onto new objects by generative AI models, proposing b…
RefineRAG introduces a novel word-level poisoning framework that significantly enhances knowledge poisoning attacks against RAG systems, achieving state-of-the-art effectiveness and transferability to…
The paper proposes a universal graph backdoor defense framework that addresses feature-based graph backdoor attacks, which are more challenging than traditional subgraph-based attacks, by leveraging l…
The paper proposes REACT, an adversarial training framework that significantly enhances the robustness and few-shot performance of machine-generated text detection by having a Retrieval-Augmented Gene…
The paper systematically evaluates static and dynamic adversarial attacks on the ALEX learned index, finding that while static poisoning has minimal impact, dynamic attacks can cause significant slowd…
Liangyi Huang, Zichen Liu, Fei Shao, Shang Ma +4 more
The paper introduces GRID, an end-to-end framework that significantly improves the construction of security knowledge graphs from cyber threat intelligence by replacing unstable LLM-based supervision…
The paper proposes PRAETORIAN, a novel defense mechanism for Graph Neural Networks (GNNs) that targets the intrinsic structural requirements of backdoor attacks, significantly reducing the attack succ…
Anjun Gao, Yueyang Quan, Yufei Xia, Zhuqing Liu +1 more
Patcher is a post-hoc defense framework that repairs backdoored large language models by localizing hidden triggers and patching the model using only a single reported failure case.
Kai Wang, Jiale Zhang, Chengcheng Zhu, Chuang Ma +1 more
The paper proposes Hydra, a framework to stabilize and control the injection of multiple, conflicting backdoor triggers into text-to-image diffusion models, ensuring high attack reliability while main…