~ similar to 2605.09504v1· 20 results
The paper systematically maps LLM agent vulnerabilities by testing 10,000 prompt variations, finding that 'goal reframing' language is the primary trigger for exploitation, rather than broad adversari…
Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more
The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these honeypots provide substantially longer and harder-to-detect…
Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more
The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these systems provide substantially longer and harder-to-detect i…
The paper empirically evaluates various agentic architectures for offensive security tasks, finding that while broader coordination improves coverage, the optimal architecture is non-monotonic and dep…
The paper argues that the near-term impact of LLM-assisted vulnerability discovery is not simply an increase in zero-day volume, but a critical bottleneck in defender remediation throughput, shifting…
The paper introduces ExploitBench, a capability-graded benchmark that measures the progressive stages of exploitation, demonstrating that while current frontier models can easily trigger bugs, achievi…
Yuhang Wang, Haichang Gao, Zhenxing Niu, Zhaoxiang Liu +3 more
The paper systematically evaluates six OpenClaw-series AI agent frameworks, demonstrating that these agentized systems possess significant security vulnerabilities that are distinct from and more seve…
ZERO-APT introduces a novel closed-loop adversarial framework for automated penetration testing that simulates attacks against an intelligent, real-time defending system, achieving a high attack succe…
Jianan Ma, Xiaohu Du, Ruixiao Lin, Yaoxiang Bian +7 more
The paper introduces a multi-dimensional evasion framework and a new benchmark (A3S-Bench) to test autonomous agents, demonstrating that stateful, multi-turn attacks significantly increase system risk…
The paper establishes a standardized security assessment framework and develops a multi-layered defensive system, demonstrating that systematic testing and external defenses are crucial for safe LLM d…
The paper proposes a unified closed-loop threat taxonomy to systematically analyze and defend foundation models by explicitly framing the bidirectional security interactions between data and models.
Automation-Exploit is a multi-agent LLM framework that enables adaptive offensive security by using a digital twin to safely test and execute high-risk memory-corruption exploits on live targets.
This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…
This study empirically measures the consistency and effectiveness of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation rates am…
The paper proposes a general-purpose pipeline to train automated red teaming models capable of generating attacks for arbitrary adversarial goals, overcoming the limitations of current methods that ar…
The paper introduces uGen, the first LLM-driven framework that uses a retrieval-augmented, multi-agent design to automatically generate functionally correct microarchitectural attack Proof-of-Concepts…
The paper benchmarks current frontier computer-using agents against hand-crafted attacks, finding that while they are highly safe in browser tasks, this safety does not generalize to other domains lik…
Hwiwon Lee, Jiawei Liu, Dongjun Kim, Ziqi Zhang +2 more
The paper introduces SEC-bench Pro, a rigorous benchmark for evaluating LLM-based bug hunting on complex software, finding that even advanced agents struggle with long-horizon security tasks.
Houjun Liu, Lisa Einstein, John Yang, Joachim Baumann +4 more
SecureForge is an automated pipeline that significantly reduces cybersecurity vulnerabilities in LLM-generated code by optimizing system prompts, achieving up to a 48% reduction in output vulnerabilit…
Jiaren Peng, Zeqin Li, Chang You, Yan Wang +16 more
This paper provides the first comprehensive systematization and large-scale empirical evaluation of existing LLM-based Automated Penetration Testing (AutoPT) frameworks, offering a structured taxonomy…