The paper introduces Oracle Poisoning, an attack that corrupts knowledge graphs queried by AI agents, and shows that every tested model trusts poisoned data under directed queries once the attacker reaches moderate sophistication.
We define Oracle Poisoning, an attack class in which an adversary corrupts a structured knowledge graph that AI agents query at runtime via tool-use protocols, causing incorrect conclusions through correct reasoning. Unlike prompt injection, Oracle Poisoning manipulates the data agents reason over, not their instructions. We demonstrate six attack scenarios against a production 42-million-node code knowledge graph, providing the first empirical demonstration of knowledge graph poisoning against a production-scale agentic system, distinct from CTI embedding poisoning. Primary evaluation uses real SDK tool-use across nine models from three providers (N=30 per model), where models autonomously invoke a graph query tool and reason from results. The result is unambiguous: at moderate attacker sophistication (L2), every tested model trusts poisoned data 100% of the time, with all 269 valid trials (of 270) accepting fabricated security claims under directed queries. Under open-ended prompts, trust drops to 3-55%, confirming prompt framing as a confound; we report both conditions. An attacker sophistication gradient reveals discrete break points, each a minimum skill level at which trust flips from 0% to 100%, reframing the attack as a question not of whether but of how much. A controlled delivery-mode comparison shows that inline evaluation produces false negatives: GPT-5.1 shows 0% trust inline but 100% under both simulated and real agentic tool-use, demonstrating that delivery mode is a first-order confound. We evaluate five defences; read-only access control eliminates the direct mutation vector, while the remaining four are partial and model-dependent. Analysis of four additional platforms suggests the attack may generalise across the knowledge-graph ecosystem.
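The core mechanism, incorrect conclusions reached through correct reasoning over corrupted data, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two-node graph, the property names `sanitizes_input` and `audited`, and the functions `query_tool`, `agent_conclusion`, `poison`, and `read_only` are hypothetical stand-ins, not the paper's graph schema, tooling, or evaluated models. The "agent" is a fixed rule rather than an LLM, which is enough to show that the reasoning step stays sound while the answer changes.

```python
from types import MappingProxyType

# Hypothetical toy graph: node id -> properties. Not the paper's schema.
def build_graph():
    return {
        "func:parse_input": {"sanitizes_input": False, "audited": False},
        "func:render_page": {"sanitizes_input": True, "audited": True},
    }

def query_tool(graph, node_id):
    """Stand-in for the graph query tool an agent invokes at runtime."""
    return dict(graph[node_id])

def agent_conclusion(graph, node_id):
    """Sound rule applied to whatever the tool returns:
    a function is 'safe' only if it is both sanitized and audited."""
    facts = query_tool(graph, node_id)
    if facts["sanitizes_input"] and facts["audited"]:
        return "safe"
    return "needs review"

def poison(graph, node_id):
    """Adversary with write access fabricates security claims on a node."""
    graph[node_id]["sanitizes_input"] = True
    graph[node_id]["audited"] = True

def read_only(graph):
    """Read-only view that rejects direct mutation of nodes and properties."""
    return MappingProxyType({k: MappingProxyType(v) for k, v in graph.items()})

# Same reasoning, different data: the conclusion flips after poisoning.
graph = build_graph()
before = agent_conclusion(graph, "func:parse_input")
poison(graph, "func:parse_input")
after = agent_conclusion(graph, "func:parse_input")

# With a read-only view, the direct mutation attempt raises TypeError.
protected = read_only(build_graph())
try:
    poison(protected, "func:parse_input")
    blocked = False
except TypeError:
    blocked = True
```

In this sketch `before` and `after` differ even though `agent_conclusion` never changes, which mirrors the distinction the abstract draws from prompt injection. The read-only view only closes the direct write path in this toy setting; it says nothing about the other four defences, which the abstract describes as partial and model-dependent.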