~ similar to 2603.20637v1· 20 results
The paper introduces Oracle Poisoning, an attack that corrupts knowledge graphs used by AI agents, demonstrating that all tested models blindly trust poisoned data at high sophistication levels.
Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more
The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…
Zhe Yu, Wenpeng Xing, Gaolei Li, Shuguang Xiong +3 more
The paper introduces CORDON-MAS, a compartmentalized framework that defends Retrieval-Augmented Generation (RAG) against knowledge poisoning by enforcing strict information-flow control, significantly…
The paper evaluates four RAG architectures under knowledge base poisoning, demonstrating that advanced architectures significantly improve robustness against adversarial contradictions, localizing the…
The paper introduces an execution-grounded, cross-language framework that significantly improves the reliability of LLM-driven code vulnerability analysis by ensuring that all proposed fixes are confi…
The paper introduces STRIATUM-CTF, a modular agentic framework that uses a standardized context protocol to enable LLMs to perform multi-step, stateful reasoning for general-purpose CTF solving, achie…
Yizhe Zeng, Wei Zhang, Yunpeng Li, Juxin Xiao +2 more
MirageBackdoor introduces a novel, highly stealthy backdoor attack that forces Large Language Models to generate correct reasoning steps (Think Well) but output an incorrect final answer (Answer Wrong…
Max Hartman, Vidhata Jayaraman, Moulik Choraria, Yash Savani +1 more
The paper introduces TraceGuard, a detectability-aware antidistillation method that identifies and poisons 'thought anchors'—sparsely critical sentences—to degrade student model learning without makin…
This paper analyzes large-scale reasoning traces from LLM-based binary vulnerability analysis, identifying four structured, token-level implicit patterns that govern how LLMs explore code paths.
The paper proposes a novel '3+1' heterogeneous multi-agent architecture using cloud LLMs and a local verifier to achieve high-accuracy, cost-effective code vulnerability detection, significantly outpe…
This systematic mapping survey reviews label-efficient approaches for code vulnerability detection, synthesizing five paradigm families and providing a decision guide to navigate trade-offs.
Guangze Zhao, Yongzheng Zhang, Weilin Gai, Hongri Liu +2 more
HunterAgent is a neuro-symbolic framework that reconstructs causal attack chains from fragmented, anti-forensics-corrupted logs, achieving high accuracy while drastically reducing hallucination.
Xunguang Wang, Yuguang Zhou, Qingyue Wang, Zongjie Li +4 more
This paper introduces a novel framework, the Reasoning Safety Monitor, to detect and prevent logical inconsistencies and adversarial manipulations within the internal reasoning steps of large language…
Yiqi Wang, Jiaqi Zhang, Taotao Cai, Zirui Liu +5 more
This survey provides a systematic framework and taxonomy for evidence tracing and execution provenance in LLM agents, addressing the difficulty of verifying and auditing complex agent behaviors.
Shenao Wang, Junjie He, Yanjie Zhao, Yayi Wang +2 more
The paper introduces MalSkills, a neuro-symbolic framework that detects malicious skills in the expanding agentic supply chain by analyzing security-sensitive operations across heterogeneous artifacts…
SEAL-Tag is a privacy-preserving runtime environment that mitigates PII leakage in Retrieval-Augmented Generation (RAG) systems by enforcing verifiable evidence aggregation and structured auditing.
ZERO-APT introduces a novel closed-loop adversarial framework for automated penetration testing that simulates attacks against an intelligent, real-time defending system, achieving a high attack succe…
The paper introduces MOSAIC-Bench, a benchmark demonstrating that coding agents can ship exploitable code by complying with seemingly innocuous, staged tasks, a vulnerability that is not easily mitiga…
Chenning Li, Pan Hu, Justin Xu, Baris Ozbas +8 more
The paper introduces ADR, a novel, production-proven detection system that provides high-fidelity security monitoring for AI agents operating via the Model Context Protocol, significantly outperformin…
The paper introduces Phoenix, a training-free multi-agent framework that detects code vulnerabilities by synthesizing project-specific behavioral contracts, significantly outperforming existing method…