Testing
Fuzzing, test generation, verification, and software quality
20 papers indexed
PickleFuzzer: A Case Study in Fuzzing for Discrepancies Between Python Pickle Implementations
The paper introduces PickleFuzzer, a custom fuzzer that identifies security-critical discrepancies across different Python pickle implementations, finding 14 new bugs including four that could bypass…
Adaptive and AI-Augmented Security Testing: A Systematic Survey of Program Analysis, Feedback-Driven Testing, and Hybrid Learning-Based Approaches
This paper systematically surveys adaptive and AI-augmented security testing, concluding that a major gap exists—structural-adaptive fragmentation—where current systems fail to integrate structural pr…
Position: AI Security Policy Should Target Systems, Not Models
The paper demonstrates that advanced capabilities, such as jailbreaking large language models and finding software vulnerabilities, can be achieved effectively at zero cost by coordinating multiple sm…
FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction
Ze Sheng, Zhicheng Chen, Qingxiao Xu, Kewen Zhu +1 more
FuzzingBrain V2 is a multi-agent LLM system that significantly improves automated vulnerability discovery by ensuring all reported bugs are fuzzer-reproducible and handling complex cross-function depe…
Quality-Assured Fuzz Harness Generation via the Four Principles Framework
Ze Sheng, Dmitrijs Trizna, Luigino Camastra, Zhicheng Chen +2 more
The paper introduces QuartetFuzz, an autonomous system that systematically ensures the correctness of fuzzing harnesses using a novel Four Principles framework, significantly improving vulnerability d…
SDLLMFuzz: Dynamic-static LLM-assisted greybox fuzzing for structured input programs
SDLLMFuzz is a novel dynamic-static framework that combines LLM-based structure-aware input generation with semantic feedback from crash analysis to significantly improve vulnerability discovery in st…
From CRUD to Autonomous Agents: Formal Validation and Zero-Trust Security for Semantic Gateways in AI-Native Enterprise Systems
The paper proposes a Semantic Gateway and a Zero-Trust security model to formally validate and secure autonomous AI agents operating in enterprise systems, achieving a 100% discovery rate of unauthori…
Before the Model Learns the Bug:Fuzzing RLVR Verifiers
The paper introduces a verifier-fuzzing framework to detect and analyze failure modes in Reinforcement Learning with Verifiable Rewards (RLVR) where bugs in the reward verifier can be exploited by the…
Agentic Fuzzing: Opportunities and Challenges
The paper proposes agentic fuzzing, a novel bug-finding approach where deep agents perform direct reasoning based on historical bugs to discover logic bugs in mature codebases.
SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?
Hwiwon Lee, Jiawei Liu, Dongjun Kim, Ziqi Zhang +2 more
The paper introduces SEC-bench Pro, a rigorous benchmark for evaluating LLM-based bug hunting on complex software, finding that even advanced agents struggle with long-horizon security tasks.
AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection
Zijie Zhao, Chenyuan Yang, Weidong Wang, Yihan Yang +2 more
AnyPoC introduces a general multi-agent framework that reliably generates and validates executable Proof-of-Concept (PoC) tests from candidate bug reports, significantly improving automated bug detect…
Finding Memory Leaks in C/C++ Programs via Neuro-Symbolic Augmented Static Analysis
Huihui Huang, Jieke Shi, Bo Wang, Zhou Yang +1 more
MemHint is a neuro-symbolic static analysis pipeline that significantly improves memory leak detection in C/C++ by combining LLM semantic understanding with Z3 symbolic reasoning, detecting more leaks…
Stop Starving or Stuffing Me: Boosting Firmware Fuzzing Efficiency with On-demand Input Delivery
Shandian Shen, Wei Zhou, Keming Zhao, Peng Liu +2 more
The paper introduces FIDO, a novel framework that significantly boosts firmware fuzzing efficiency by accurately managing the timing and quantity of input delivery based on the firmware's internal inp…
Obfuscating Code Vulnerabilities against Static Analysis in JavaScript Code
This paper empirically demonstrates that current Static Application Security Testing (SAST) tools are fundamentally unreliable against common JavaScript obfuscation techniques, showing that obfuscatio…
VCAO: Verifier-Centered Agentic Orchestration for Strategic OS Vulnerability Discovery
The paper introduces VCAO, a novel verifier-centered agentic orchestration framework that models OS vulnerability discovery as a Bayesian Stackelberg game, significantly improving vulnerability discov…
PeAR: A Static Binary Rewriting Framework for Binary-Only Fuzzing
The paper introduces PeAR, a static binary rewriting framework that proves static binary instrumentation (SBI) is a practical and effective alternative to dynamic binary instrumentation (DBI) for high…
VulnScout-C: A Lightweight Transformer for C Code Vulnerability Detection
Aymen Lassoued, Nacef Mbarek, Bechir Dardouri, Bassem Ouni +2 more
The paper introduces VULNSCOUT-C, a compact, specialized transformer model that achieves state-of-the-art performance in C code vulnerability detection while maintaining low inference cost, making it…
Agora: Toward Autonomous Bug Detection in Production-Level Consensus Protocols with LLM Agents
Xiang Liu, Sa Song, Zhaowei Zhang, Huiying Lan +5 more
The paper introduces Agora, a domain-aware multi-agent framework that successfully detects deep, previously unknown logic bugs in complex consensus protocols, outperforming existing LLM-based analysis…
Strategic Heterogeneous Multi-Agent Architecture for Cost-Effective Code Vulnerability Detection
The paper proposes a novel '3+1' heterogeneous multi-agent architecture using cloud LLMs and a local verifier to achieve high-accuracy, cost-effective code vulnerability detection, significantly outpe…
Security Concerns in Generative AI Coding Assistants: Insights from Online Discussions on GitHub Copilot
This paper analyzes online developer discussions to identify four major security concerns—data leakage, code licensing, adversarial attacks, and insecure suggestions—associated with using generative A…