~ similar to 2605.02868v1· 20 results
Ziqiao Kong, Wanxu Xia, Chong Wang, Yi Lu +4 more
Knowdit is a knowledge-driven, agentic framework that significantly improves smart contract vulnerability detection by modeling shared DeFi semantics and leveraging historical audit knowledge.
The paper introduces Phoenix, a training-free multi-agent framework that detects code vulnerabilities by synthesizing project-specific behavioral contracts, significantly outperforming existing method…
Bowen Cai, Weiheng Bai, Youshui Lu, Haoran Xu +3 more
GenDetect introduces a novel framework to rapidly generalize detection rules from single observed DeFi exploits, significantly improving resilience against subsequent, similar 'Imitative Attack Cascad…
The paper introduces ExploitBench, a capability-graded benchmark that measures the progressive stages of exploitation, demonstrating that while current frontier models can easily trigger bugs, achievi…
Bowen Cai, Weiheng Bai, Hangyun Tang, Youshui Lu +1 more
The paper introduces FAUDITOR, a specialized, self-learning fuzzer that detects complex Monetarily Exploitable Vulnerabilities (MEVuls) in smart contracts by integrating NLP-processed auditor knowledg…
Hwiwon Lee, Jiawei Liu, Dongjun Kim, Ziqi Zhang +2 more
The paper introduces SEC-bench Pro, a rigorous benchmark for evaluating LLM-based bug hunting on complex software, finding that even advanced agents struggle with long-horizon security tasks.
This paper outlines a comprehensive research framework for smart contract security, moving beyond simple vulnerability detection to encompass advanced areas like semantic reasoning, automated repair,…
FORGE is a multi-agent system that integrates vulnerability exploitation, prioritization, and detection engineering into a single pipeline, achieving high-fidelity, multi-level exploitation and genera…
PoC-Adapt is an end-to-end framework that significantly improves the reliability and efficiency of automated vulnerability exploitation by integrating semantic state validation and reinforcement learn…
The paper introduces MOSAIC-Bench, a benchmark demonstrating that coding agents can ship exploitable code by complying with seemingly innocuous, staged tasks, a vulnerability that is not easily mitiga…
This paper benchmarks LLMs for smart contract security analysis, concluding that while LLMs show potential, their reliability is limited by lexical bias and requires integration with traditional stati…
The paper analyzes the nascent DeFi investment agent market, finding that while token valuations are high, current deployments are heterogeneous, lack clear autonomous execution, and exhibit poor risk…
The paper empirically analyzes the nascent DeFi investment agent market, finding that while token valuations are high, current deployments lack robust autonomous execution and exhibit poor risk-adjust…
ContractShield is a robust multimodal framework that uses a novel three-level fusion mechanism to accurately detect multiple types of vulnerabilities in obfuscated smart contracts, significantly outpe…
Zhun Wang, Nico Schiller, Hongwei Li, Srijiith Sesha Narayana +12 more
The paper introduces ExploitGym, a large-scale benchmark, demonstrating that advanced AI agents can successfully turn theoretical software vulnerabilities into working exploits, highlighting growing c…
SAILOR automates the construction of symbolic execution harnesses by combining static analysis and LLM-based synthesis, significantly improving the scalability and effectiveness of vulnerability disco…
Simiao Liu, Fang Liu, Li Zhang, Yang Liu +1 more
ContraFix is an agentic framework that improves automated vulnerability repair by using differential runtime evidence to pinpoint the root cause of bugs, achieving state-of-the-art performance on majo…
Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more
VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…
The paper systematically maps LLM agent vulnerabilities by testing 10,000 prompt variations, finding that 'goal reframing' language is the primary trigger for exploitation, rather than broad adversari…
AttackPathGNN proposes a novel graph neural network approach to detect smart contract vulnerabilities by modeling explicit attack paths and function interactions, achieving high detection rates on sta…