~ similar to 2604.24975v1· 20 results
Luze Sun, Anshuman Suri, Harsh Chaudhari, Cristina Nita-Rotaru +1 more
The paper introduces PoisonForge, a comprehensive benchmark demonstrating that even a small number of targeted poisoned examples can significantly compromise the safety and reliability of instruction-…
The paper demonstrates a gray-box poisoning attack against continuous malware detection pipelines using subtle binary manipulations, showing that IAT-based perturbations can significantly degrade dete…
The paper demonstrates that current defenses against malicious fine-tuning of foundation models are insufficient because they only address fixed attacks, and introduces a unified adaptive attack that…
The paper introduces BadSkill, a novel backdoor attack formulation that targets third-party agent skills by poisoning the embedded model artifacts, achieving high attack success rates across various m…
The paper introduces XFED, a novel non-collusive model poisoning attack that demonstrates the feasibility of compromising Federated Learning systems without requiring coordination among attackers, byp…
Yuchen Shi, Xin Guo, Huajie Chen, Tianqing Zhu +2 more
The paper proposes Cluster Segregation Concealment (CSC), a novel defense that identifies and neutralizes backdoor triggers by relabeling poisoned samples to a virtual class, achieving near-zero attac…
Jack Sanderson, Yihan Wang, Xiaoqian Lu, Gautam Kamath +1 more
The paper introduces the threat model of sequential data poisoning, demonstrating that multiple, collaborating attackers can exploit compound vulnerabilities in LLM post-training pipelines that are in…
The paper constructs a large, adversarial malware dataset from real-world binaries, demonstrating high evasion rates and showing that even small amounts of poisoned data can severely compromise malwar…
Haozhen Wang, Haoyue Liu, Jionghao Zhu, Zhichao Wang +2 more
The paper introduces PIDP-Attack, a novel compound adversarial attack that combines prompt injection with database poisoning to manipulate Retrieval-Augmented Generation (RAG) systems against arbitrar…
The paper introduces Oracle Poisoning, an attack that corrupts knowledge graphs used by AI agents, demonstrating that all tested models blindly trust poisoned data at high sophistication levels.
The study demonstrates that poisoned identifier names can survive LLM deobfuscation, even when the model correctly understands the code's semantics, unless the task is reframed from deobfuscation to f…
The paper proposes the first general defense framework to make all union-preserving Differential Privacy (DP) protocols, specifically those based on shuffle-DP, resilient against poisoning attacks.
Zi Li, Tian Zhou, Wenze Li, Jingyu Hua +2 more
This paper introduces a novel supply-chain attack that uses model code backdoors to actively steal sensitive secrets from local LLM fine-tuning datasets, bypassing current privacy defenses.
Xuanye Zhang, Yongsen Zheng, Zhuqin Xu, Kaiyu Zhou +4 more
MemMorph introduces a novel memory poisoning attack that biases LLM agent tool selection by injecting crafted records into the agent's long-term memory, achieving high success rates even against moder…
This paper provides a systematic, lifecycle-based framework for analyzing security threats and defenses across the entire fine-tuning process of LLMs, revealing that attack effectiveness is highly mod…
Jiahao Chen, Zhiming Zhao, Yuwen Pu, Chunyi Zhou +3 more
This paper argues that much of the existing research on Federated Learning (FL) security is based on idealized assumptions, and provides a practical evaluation framework showing that real-world attack…
The paper systematically evaluates various defense mechanisms against persistent memory attacks on LLM agents, finding that only tool-gating at the memory layer (Memory Sandbox) effectively mitigates…
Hanzhi Liu, Chaofan Shou, Hongbo Wen, Yanju Chen +2 more
This paper systematically analyzes the threat posed by malicious third-party API routers in the LLM supply chain, finding that a significant number of routers actively perform payload injection, crede…
The paper proposes a unified closed-loop threat taxonomy to systematically analyze and defend foundation models by explicitly framing the bidirectional security interactions between data and models.
The paper proposes TAGBD, a graph-aware backdoor attack that demonstrates that inconspicuous poison text alone can reliably compromise text-attributed graph learning systems.