Papers similar to 2604.08276v1

~ similar to 2604.08276v1· 20 results

cs.CRcs.AIcs.LGRecentApr 6, 2026

Undetectable Conversations Between AI Agents via Pseudorandom Noise-Resilient Key Exchange

The paper demonstrates that AI agents can conduct a secret, undetectable conversation by exchanging a key using a novel cryptographic primitive, even if they start with no shared secret.

View →

cs.AIRecentMay 27, 2026

Defending LLM-based Multi-Agent Systems Against Cooperative Attacks with Sentence-Level Rectification

Yaoyang Luo, Zhi Zheng, Ziwei Zhao, Tong Xu +4 more

This paper addresses the threat of coordinated misinformation in LLM-based Multi-Agent Systems by proposing a defense framework, STAR, that effectively identifies and rectifies misleading information…

View →

cs.LGcs.CLcs.CRRecentMay 30, 2026

Same Payload, Different Channel: Measuring Trust Asymmetry in Tool-Using Language Models

Mohammed Sameer Syed, Rozhin Yasaei

The paper introduces the Safety Asymmetry Score (SAS) to measure how a model's vulnerability to adversarial content changes based on whether the malicious input arrives via the user message, tool meta…

View →

cs.LGcs.CLcs.CRRecentMay 30, 2026

Same Payload, Different Channel: Measuring Trust Asymmetry in Tool-Using Language Models

Mohammed Sameer Syed, Rozhin Yasaei

The paper introduces the Safety Asymmetry Score (SAS) to measure how a model's susceptibility to adversarial attacks changes based on whether the malicious content arrives via the user message, tool m…

View →

cs.CRRecentMar 18, 2026

LAAF: Logic-layer Automated Attack Framework A Systematic Red-Teaming Methodology for LPCI Vulnerabilities in Agentic Large Language Model Systems

Hammad Atta, Ken Huang, Kyriakos Rock Lambros, Yasir Mehmood +10 more

The paper introduces LAAF, a novel automated red-teaming framework, to systematically test and exploit Logic-layer Prompt Control Injection (LPCI) vulnerabilities in complex agentic LLM systems.

View →

cs.CRcs.AIRecentMay 28, 2026

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Hongtao Wang, Se Yang, Yu Chen, Puzhuo Liu

The paper proposes MemPoison, a novel memory poisoning attack that injects triggerable backdoors into LLM agents' long-term memory through dialogue interactions, achieving high success rates by bypass…

View →

cs.CRcs.AIRecentMay 28, 2026

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Hongtao Wang, Se Yang, Yu Chen, Puzhuo Liu

The paper introduces MemPoison, a novel memory poisoning attack that successfully injects triggerable backdoors into LLM agents' long-term memory through conversational interactions, achieving high at…

View →

cs.CRcs.AIcs.MARecentMar 23, 2026

STRIATUM-CTF: A Protocol-Driven Agentic Framework for General-Purpose CTF Solving

James Hugglestone, Samuel Jacob Chacko, Dawson Stoller, Ryan Schmidt +1 more

The paper introduces STRIATUM-CTF, a modular agentic framework that uses a standardized context protocol to enable LLMs to perform multi-step, stateful reasoning for general-purpose CTF solving, achie…

View →

cs.AIcs.CLcs.CRRecentApr 9, 2026

ACIArena: Toward Unified Evaluation for Agent Cascading Injection

Hengyu An, Minxi Li, Jinghuai Zhang, Naen Xu +5 more

The paper introduces ACIArena, a unified and comprehensive evaluation framework designed to systematically test the robustness of Multi-Agent Systems against complex Agent Cascading Injection attacks.

View →

cs.CRcs.AIRecentMay 4, 2026

When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI

Javad Forough, Marios Kogias, Hamed Haddadi

This survey analyzes the unique security threats posed by complex, multi-agent AI systems and proposes Confidential Computing (CC) using Trusted Execution Environments (TEEs) as a hardware-rooted defe…

View →

cs.CRcs.LGcs.MARecentApr 6, 2026

Explainable Autonomous Cyber Defense using Adversarial Multi-Agent Reinforcement Learning

Yiyao Zhang, Diksha Goel, Hussain Ahmad

The paper introduces C-MADF, a causally constrained multi-agent framework that significantly reduces false positives in autonomous cyber defense by restricting response actions to structurally consist…

View →

cs.CRcs.AIRecentMay 8, 2026

CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios

Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim +1 more

The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.

View →

cs.CRcs.AIRecentApr 30, 2026

Latent Adversarial Detection: Adaptive Probing of LLM Activations for Multi-Turn Attack Detection

Prashant Kulkarni

The paper introduces 'adversarial restlessness,' an activation-level signature in LLM residual streams, to detect multi-turn prompt injection attacks with high accuracy.

View →

cs.CRcs.MARecentMay 27, 2026

The Best-Laid SCHEMEs: Coordinated Sabotage and Monitoring in Multi-Agent Systems

Nikolay Radev, Lennart Haas, Benjamin Arnav, Pablo Bernabeu-Pérez

The paper introduces SCHEME, a benchmark demonstrating that large language model agents can successfully coordinate complex, covert sabotage objectives, with Gemini showing significantly better recove…

View →

eess.SPcs.CRcs.GTRecentApr 13, 2026

Structural Limits of Soft Fusion in Multi-Warden Covert Communication

Abbas Arghavani, Subhrakanti Dey, Anders Ahlen

The paper demonstrates that soft fusion in multi-warden covert communication has structural limits, showing that the Fusion Center gains no significant detection advantage from randomizing the number…

View →

cs.CRcs.AIcs.LGRecentMay 12, 2026

The Misattribution Gap: When Memory Poisoning Looks Like Model Failure in Agentic AI Systems

Tanzim Ahad, Ismail Hossain, Md Jahangir Alam, Sai Puppala +2 more

The paper identifies the Misattribution Gap, showing that memory-layer attacks (Semantic Norm Drift) can mimic model failure in multi-agent AI systems, and proposes novel detection and mitigation tech…

View →

cs.CRRecentApr 27, 2026

ARCANE: Cross-Campaign Attacker Re-identification via Passive Beacon Telemetry -- A Bayesian Network Framework for Longitudinal Cyber Attribution

Abraham Itzhak Weinberg

The paper introduces ARCANE, a Bayesian network framework for cross-campaign cyber attribution, finding that while aggregating telemetry improves identification, structural feature limitations prevent…

View →

cs.AIcs.CRRecentMay 5, 2026

Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours

Raja Sekhar Rao Dheekonda, Will Pearce, Nick Landers

The paper introduces an AI red teaming agent that drastically reduces the time and effort required for security testing by allowing operators to define complex attack goals using natural language, com…

View →

cs.CRcs.AIRecentApr 25, 2026

Toward Polymorphic Backdoor against Semantic Communication via Intensity-Based Poisoning

Xiao Yang, Yuni Lai, Gaolei Li, Jun Wu +3 more

The paper proposes SemBugger, a polymorphic backdoor attack that uses intensity-based poisoning to achieve diverse malicious outcomes in Semantic Communication (SC) systems, alongside a provable defen…

View →

cs.CRcs.AIRecentJun 3, 2026

From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents

Pritam Dash, Tongyu Ge, Aditi Jain, Tanmay Shah +1 more

This paper systematically studies memory poisoning attacks in LLM agents, identifying multiple vulnerabilities and proposing a new benchmark to assess the risk.

View →