~ similar to 2605.23168v1· 20 results
Khang Tran, Yazan Boshmaf, Issa Khalil, NhatHai Phan +2 more
The paper introduces Poison-with-Style (PwS), a stealthy model poisoning attack that exploits developers' inherent code styles as covert triggers to make Code LLMs generate vulnerable code without exp…
The paper introduces BadSkill, a novel backdoor attack formulation that targets third-party agent skills by poisoning the embedded model artifacts, achieving high attack success rates across various m…
The paper introduces Oracle Poisoning, an attack that corrupts knowledge graphs used by AI agents, demonstrating that all tested models blindly trust poisoned data at high sophistication levels.
Yuting Ning, Zhehao Zhang, Yash Kumar Lal, Boyu Gou +7 more
The paper introduces SkillHarm, a comprehensive benchmark and automated framework for evaluating skill-based attacks across the entire agent skill-use lifecycle, demonstrating that current agents rema…
The paper introduces a comprehensive taxonomy and auditing framework to assess the collective coverage of existing LLM attack benchmarks, revealing significant and systematic gaps in current testing m…
This paper provides a systematic, lifecycle-based framework for analyzing security threats and defenses across the entire fine-tuning process of LLMs, revealing that attack effectiveness is highly mod…
The paper systematically evaluates static and dynamic adversarial attacks on the ALEX learned index, finding that while static poisoning has minimal impact, dynamic attacks can cause significant slowd…
The paper demonstrates a gray-box poisoning attack against continuous malware detection pipelines using subtle binary manipulations, showing that IAT-based perturbations can significantly degrade dete…
The paper introduces a challenging benchmark for LLM agents to perform unsupervised threat hunting on raw Windows event logs, finding that current frontier models perform poorly and are not ready for…
Wenjie Xiao, Xuehai Tang, Biyu Zhou, Songlin Hu +1 more
RouteGuard is a novel detector that identifies skill poisoning in LLM agents by monitoring structured internal attention shifts, achieving high detection rates on critical skill-injection attacks.
Shi Liu, Xuehai Tang, Xikang Yang, Liang Lin +3 more
This paper introduces a new benchmark to test Tool Description Poisoning (TDP) attacks on LLM agents, demonstrating that even advanced models like GPT-4o are highly vulnerable and that current defense…
Jack Sanderson, Yihan Wang, Xiaoqian Lu, Gautam Kamath +1 more
The paper introduces the threat model of sequential data poisoning, demonstrating that multiple, collaborating attackers can exploit compound vulnerabilities in LLM post-training pipelines that are in…
The paper introduces TRUSTDESC, a novel framework that prevents tool poisoning attacks in LLM applications by automatically generating highly accurate and trusted tool descriptions directly from the t…
Huiyu Xu, Zhibo Wang, Wenhui Zhang, Ziqi Zhu +3 more
The paper introduces LoopTrap, an automated red-teaming framework that demonstrates how malicious prompts can poison the termination judgment of LLM agents, causing unbounded computation.
Wei Zou, Mingwen Dong, Miguel Romero Calvo, Shuaichen Chang +6 more
The paper introduces eTAMP, a novel attack that poisons LLM web agents' memory using only environmental observations, demonstrating cross-site and cross-session compromise without direct memory access…
Chang Jin, An Wang, Zeming Wei, Kai Wang +6 more
The paper introduces SkillSafetyBench, a comprehensive benchmark demonstrating that agent safety failures often stem from adversarial influences within reusable skills and execution environments, rath…
Yang Luo, Zifeng Kang, Tiantian Ji, Xinran Liu +3 more
The paper introduces SHADOWMERGE, a novel poisoning attack that successfully compromises graph-based agent memory by exploiting relation-channel conflicts, achieving a high attack success rate across…
Yukun Jiang, Yage Zhang, Michael Backes, Xinyue Shen +1 more
This paper presents HarmfulSkillBench, a large-scale benchmark demonstrating that even small percentages of publicly available skills can be misused for harmful actions, significantly lowering LLM ref…
This paper demonstrates that LLM cascade systems, designed for efficiency, are vulnerable to targeted adversarial attacks that simultaneously degrade both performance and cost-efficiency.
Xuanye Zhang, Yongsen Zheng, Zhuqin Xu, Kaiyu Zhou +4 more
MemMorph introduces a novel memory poisoning attack that biases LLM agent tool selection by injecting crafted records into the agent's long-term memory, achieving high success rates even against moder…