Papers similar to 2606.03811v1

~ similar to 2606.03811v1· 20 results

cs.CRRecentMay 4, 2026

Autonomous LLM Agent Worms: Cross-Platform Propagation, Automated Discovery and Temporal Re-Entry Defense

The paper introduces a systematic framework and defense mechanisms to analyze and mitigate autonomous LLM agent worms that propagate through persistent agent state and cross-platform multi-agent syste…

View →

cs.CRcs.AIcs.LGRecentMay 28, 2026

Honeyval: A Comprehensive Evaluation Framework for LLM-powered HTTP Honeypots

Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more

The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these honeypots provide substantially longer and harder-to-detect…

View →

cs.CRcs.AIcs.LGRecentMay 28, 2026

Honeyval: A Comprehensive Evaluation Framework for LLM-powered HTTP Honeypots

Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more

The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these systems provide substantially longer and harder-to-detect i…

View →

cs.CRcs.AIcs.CLRecentMay 12, 2026

Can a Single Message Paralyze the AI Infrastructure? The Rise of AbO-DDoS Attacks through Targeted Mobius Injection

Zi Liang, Ronghua Li, Yanyun Wang, Qingqing Ye +1 more

This paper introduces Mobius Injection, a novel, lightweight attack that weaponizes autonomous LLM agents into zombie nodes to launch highly scalable AbO-DDoS attacks by exploiting a vulnerability cal…

View →

cs.CRcs.AIcs.HCRecentMay 6, 2026

Agentic AI and the Industrialization of Cyber Offense: Forecast, Consequences, and Defensive Priorities for Enterprises and the Mittelstand

Christopher Koch

The paper forecasts that agentic AI will compress the cyber attack lifecycle by lowering the cost of multiple attack stages, necessitating immediate operational security upgrades for enterprises and t…

View →

cs.CRRecentMay 3, 2026

AgenticVM: Agentic AI for Adaptive Software Vulnerability Management

Asrul Arifin, Hussain Ahmad, Yiyao Zhang, Diksha Goel

AgenticVM is a multi-agent framework that uses LLMs and specialized tools to automate and drastically reduce the volume of software vulnerabilities into actionable, prioritized queues.

View →

cs.CRRecentApr 9, 2026

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain

Hanzhi Liu, Chaofan Shou, Hongbo Wen, Yanju Chen +2 more

This paper systematically analyzes the threat posed by malicious third-party API routers in the LLM supply chain, finding that a significant number of routers actively perform payload injection, crede…

View →

cs.CRcs.AIcs.LGRecentMay 11, 2026

ExploitGym: Can AI Agents Turn Security Vulnerabilities into Real Attacks?

Zhun Wang, Nico Schiller, Hongwei Li, Srijiith Sesha Narayana +12 more

The paper introduces ExploitGym, a large-scale benchmark, demonstrating that advanced AI agents can successfully turn theoretical software vulnerabilities into working exploits, highlighting growing c…

View →

cs.CRcs.AIRecentApr 27, 2026

AgentWard: A Lifecycle Security Architecture for Autonomous AI Agents

Yixiang Zhang, Xinhao Deng, Jiaqing Wu, Yue Xiao +2 more

The paper introduces AgentWard, a lifecycle-oriented, defense-in-depth architecture designed to systematically secure autonomous AI agents by protecting them across all stages of their operation.

View →

cs.CRcs.AIRecentMar 29, 2026

A Security Analysis of the OpenClaw AI Agent Framework

Surada Suwansathit, Yuxuan Zhang, Guofei Gu

This paper analyzes 470 security advisories in the OpenClaw AI agent framework, demonstrating that the system's structural weakness lies in per-layer trust enforcement, enabling cross-layer remote cod…

View →

cs.CRRecentMay 14, 2026

Toward Securing AI Agents Like Operating Systems

Lukas Pirch, Micha Horlboge, Patrick Großmann, Syeda Mahnur Asif +3 more

This paper analyzes the security of LLM-based autonomous agents by drawing parallels to operating system security, finding that while some vulnerabilities are inherent, many can be mitigated using est…

View →

cs.CRcs.AIRecentMay 26, 2026

Lessons from Penetration Tests on Large-Scale Agent Systems

Kevin Eykholt, Dhilung Kirat, Xiaokui Shu, Jiyong Jang +2 more

The paper reports on penetration tests conducted on proprietary, large-scale AI agent systems, finding that security vulnerabilities persist despite stricter development standards.

View →

cs.CRcs.OSRecentApr 20, 2026

AgenTEE: Confidential LLM Agent Execution on Edge Devices

Sina Abdollahi, Mohammad M Maheri, Javad Forough, Amir Al Sadi +4 more

AgenTEE is a system that enables the secure, confidential execution of complex LLM agent pipelines directly on edge devices by using isolated confidential virtual machines.

View →

cs.CRcs.AIRecentMay 18, 2026

Agent Security is a Systems Problem

Mihai Christodorescu, Earlence Fernandes, Ashish Hooda, Somesh Jha +10 more

The paper argues that agent security must be treated as a systems problem, requiring the enforcement of security invariants at the system level rather than solely relying on improving the underlying A…

View →

cs.CRcs.CLRecentMay 30, 2026

"I Strongly Suspect This Website Is a Scam": Benchmarking PII Leakage and Detection without Defense in Autonomous Web Agents

Soham Roy, Sarthakbrata Halder, Arya Bharaty, Vaibhav Bhaskar +4 more

The paper demonstrates that autonomous web agents are highly susceptible to social-engineering attacks, leaking critical PII even when they internally flag a site as suspicious, necessitating output-l…

View →

cs.CRcs.CLRecentMay 30, 2026

"I Strongly Suspect This Website Is a Scam": Benchmarking PII Leakage and Detection without Defense in Autonomous Web Agents

Soham Roy, Sarthakbrata Halder, Arya Bharaty, Vaibhav Bhaskar +4 more

View →

cs.CRRecentApr 27, 2026

Dynamic Cyber Ranges

Víctor Mayoral-Vilches, María Sanz-Gómez, Francesco Balassone, Maite Del Mundo De Torres +5 more

The paper proposes Dynamic Cyber Ranges, an advanced cyber range environment using LLM-driven Defender agents to counter the saturation of traditional security benchmarks, demonstrating that these dyn…

View →

cs.CRRecentApr 24, 2026

Automation-Exploit: A Multi-Agent LLM Framework for Adaptive Offensive Security with Digital Twin-Based Risk-Mitigated Exploitation

Biagio Andreucci, Arcangelo Castiglione

Automation-Exploit is a multi-agent LLM framework that enables adaptive offensive security by using a digital twin to safely test and execute high-risk memory-corruption exploits on live targets.

View →

cs.CYcs.CRRecentMar 31, 2026

Stand-Alone Complex or Vibercrime? Exploring the adoption and innovation of GenAI tools, coding assistants, and agents within cybercrime ecosystems

Jack Hughes, Ben Collier, Daniel R. Thomas

The paper analyzes the real threat of GenAI in cybercrime, arguing that while high-end automation (Stand-Alone Complex) is possible, current adoption is low and primarily affects skilled actors, sugge…

View →

cs.CRcs.AIRecentMay 20, 2026

PocketAgents: A Manifest-Driven Library of Autonomous Defense Agents

Sidnei Barbieri, Ágney Lopes Roth Ferraz, Lourenço Alves Pereira Júnior

PocketAgents introduces a manifest-driven framework for autonomous defense agents, enabling measurable and attributable LLM-driven security responses by strictly controlling agent actions and telemetr…

View →