Papers similar to 2605.10074v1

~ similar to 2605.10074v1· 20 results

cs.CRcs.SERecentMay 20, 2026

FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction

Ze Sheng, Zhicheng Chen, Qingxiao Xu, Kewen Zhu +1 more

FuzzingBrain V2 is a multi-agent LLM system that significantly improves automated vulnerability discovery by ensuring all reported bugs are fuzzer-reproducible and handling complex cross-function depe…

View →

cs.SEcs.CRRecentMay 14, 2026

FuzzAgent: Multi-Agent System for Evolutionary Library Fuzzing

Yunlong Lyu, Peng Chen, Fengyi Wu, Junzhe Yu +2 more

FuzzAgent introduces a multi-agent, evolutionary system that significantly improves library fuzzing by iteratively refining the test suite based on runtime feedback, achieving superior coverage and bu…

View →

cs.CRcs.CLRecentMay 4, 2026

FunFuzz: An LLM-Powered Evolutionary Fuzzing Framework

Mario Rodríguez Béjar, B. Romera-Paredes, Jose L. Hernández-Ramos

FunFuzz introduces a multi-island evolutionary fuzzing framework that uses LLMs to generate structured inputs, achieving superior compiler coverage and discovering more unique failures compared to exi…

View →

cs.CRcs.AIRecentMay 13, 2026

No Attack Required: Semantic Fuzzing for Specification Violations in Agent Skills

Ying Li, Hongbo Wen, Yanju Chen, Hanzhi Liu +2 more

The paper introduces Sefz, a semantic fuzzing framework that automatically discovers specification violations in LLM agent skills, finding a significant number of previously unknown exploitable guardr…

View →

cs.CRcs.SERecentApr 5, 2026

Triggering and Detecting Exploitable Library Vulnerability from the Client by Directed Greybox Fuzzing

Yukai Zhao, Menghan Wu, Xing Hu, Shaohua Wang +2 more

The paper proposes LiveFuzz, a directed greybox fuzzing technique that detects the exploitability of third-party library vulnerabilities from client programs without requiring pre-existing proof-of-co…

View →

cs.SEcs.AIcs.CLRecentApr 13, 2026

AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection

Zijie Zhao, Chenyuan Yang, Weidong Wang, Yihan Yang +2 more

AnyPoC introduces a general multi-agent framework that reliably generates and validates executable Proof-of-Concept (PoC) tests from candidate bug reports, significantly improving automated bug detect…

View →

cs.CRRecentMar 19, 2026

Weaver: Fuzzing JavaScript Engines at the JavaScript-WebAssembly Boundary

Lingming Zhang, Binbin Zhao, Puzhuo Liu, Qinge Xie +3 more

Weaver is a novel greybox fuzzing framework designed to uncover security vulnerabilities at the complex interaction boundary between JavaScript and WebAssembly, achieving superior code coverage and fi…

View →

cs.CRcs.PLRecentApr 20, 2026

SDLLMFuzz: Dynamic-static LLM-assisted greybox fuzzing for structured input programs

Yihao Zou, Tianming Zheng, Futai Zou, Yue Wu

SDLLMFuzz is a novel dynamic-static framework that combines LLM-based structure-aware input generation with semantic feedback from crash analysis to significantly improve vulnerability discovery in st…

View →

cs.CRcs.LGRecentMay 6, 2026

Agentic Vulnerability Reasoning on Windows COM Binaries

Hwiwon Lee, Jongseong Kim, Lingming Zhang

The paper introduces SLYP, an agentic pipeline that significantly improves the discovery of race condition vulnerabilities in Windows COM binaries and autonomously generates verified proof-of-concept…

View →

cs.SEcs.AIcs.CLRecentMay 17, 2026

ContraFix: Agentic Vulnerability Repair via Differential Runtime Evidence and Skill Reuse

Simiao Liu, Fang Liu, Li Zhang, Yang Liu +1 more

ContraFix is an agentic framework that improves automated vulnerability repair by using differential runtime evidence to pinpoint the root cause of bugs, achieving state-of-the-art performance on majo…

View →

cs.CRcs.LGRecentMay 26, 2026

SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?

Hwiwon Lee, Jiawei Liu, Dongjun Kim, Ziqi Zhang +2 more

The paper introduces SEC-bench Pro, a rigorous benchmark for evaluating LLM-based bug hunting on complex software, finding that even advanced agents struggle with long-horizon security tasks.

View →

cs.CRcs.AIcs.SERecentApr 7, 2026

Broken by Default: A Formal Verification Study of Security Vulnerabilities in AI-Generated Code

Dominik Blain, Maxime Noiseux

This study formally verified 3,500 AI-generated code artifacts and found that a majority (55.8%) contain exploitable security vulnerabilities, regardless of the LLM used.

View →

cs.CRRecentJun 1, 2026

PeAR: A Static Binary Rewriting Framework for Binary-Only Fuzzing

Alvin Charles, Adrian Herrera, Peter Oslington, Alwen Tiu

The paper introduces PeAR, a static binary rewriting framework that proves static binary instrumentation (SBI) is a practical and effective alternative to dynamic binary instrumentation (DBI) for high…

View →

cs.CRcs.SERecentMay 20, 2026

Quality-Assured Fuzz Harness Generation via the Four Principles Framework

Ze Sheng, Dmitrijs Trizna, Luigino Camastra, Zhicheng Chen +2 more

The paper introduces QuartetFuzz, an autonomous system that systematically ensures the correctness of fuzzing harnesses using a novel Four Principles framework, significantly improving vulnerability d…

View →

cs.CRcs.PLRecentMay 12, 2026

OverrideFuzz: Semantic-Aware Grammar Fuzzing for Script-Runtime Vulnerabilities

Yiran Qiu

OverrideFuzz is a novel semantic-aware grammar fuzzer designed to test script-language runtimes by specifically modeling and exploiting complex behaviors like method overriding and dynamic rebinding,…

View →

cs.SEcs.CRRecentMay 25, 2026

FuzzPilot: Plateau-Triggered Recipe Validation for Structured Text Fuzzing

Zhiyi Yao

FuzzPilot is a controller for AFL++ that validates candidate mutation recipes by running short micro-campaigns, demonstrating a mechanism to manage fuzzing plateaus, though initial results on a satura…

View →

cs.CRcs.AIRecentMay 7, 2026

Patch2Vuln: Agentic Reconstruction of Vulnerabilities from Linux Distribution Binary Patches

Isaac David, Arthur Gervais

The paper introduces Patch2Vuln, a pipeline that uses an LLM agent to reconstruct security vulnerabilities by analyzing differences between old and new Linux binary packages, successfully localizing p…

View →

cs.SEcs.AIcs.CRRecentApr 12, 2026

Verify Before You Fix: Agentic Execution Grounding for Trustworthy Cross-Language Code Analysis

Jugal Gajjar

The paper introduces an execution-grounded, cross-language framework that significantly improves the reliability of LLM-driven code vulnerability analysis by ensuring that all proposed fixes are confi…

View →

cs.CRRecentMay 8, 2026

Demystifying and Detecting Agentic Workflow Injection Vulnerabilities in GitHub Actions

Shenao Wang, Xinyi Hou, Zhao Liu, Yanjie Zhao +4 more

This paper introduces Agentic Workflow Injection (AWI), a new class of vulnerability in LLM-powered GitHub Actions, and presents TaintAWI, a novel taint-analysis tool that identifies hundreds of explo…

View →

cs.CRRecentApr 22, 2026

Synthesizing Multi-Agent Harnesses for Vulnerability Discovery

Hanzhi Liu, Chaofan Shou, Xiaonan Liu, Hongbo Wen +3 more

The paper introduces AgentFlow, a novel framework that uses a typed graph DSL and feedback-driven optimization to automatically synthesize and improve multi-agent harnesses for discovering security vu…

View →