Papers similar to 2605.02868v1

~ similar to 2605.02868v1· 20 results

cs.CRcs.AIcs.SERecentMar 27, 2026

Knowdit: Agentic Smart Contract Vulnerability Detection with Auditing Knowledge Summarization

Ziqiao Kong, Wanxu Xia, Chong Wang, Yi Lu +4 more

Knowdit is a knowledge-driven, agentic framework that significantly improves smart contract vulnerability detection by modeling shared DeFi semantics and leveraging historical audit knowledge.

View →

cs.CRcs.SERecentApr 21, 2026

Security Is Relative: Training-Free Vulnerability Detection via Multi-Agent Behavioral Contract Synthesis

Yongchao Wang, Zhiqiu Huang

The paper introduces Phoenix, a training-free multi-agent framework that detects code vulnerabilities by synthesizing project-specific behavioral contracts, significantly outperforming existing method…

View →

cs.CRcs.SERecentApr 28, 2026

GenDetect: Generalizing Reactive Detection for Resilience Against Imitative DeFi Attack Cascade

Bowen Cai, Weiheng Bai, Youshui Lu, Haoran Xu +3 more

GenDetect introduces a novel framework to rapidly generalize detection rules from single observed DeFi exploits, significantly improving resilience against subsequent, similar 'Imitative Attack Cascad…

View →

cs.CRcs.AIRecentMay 13, 2026

ExploitBench: A Capability Ladder Benchmark for LLM Cybersecurity Agents

Seunghyun Lee, David Brumley

The paper introduces ExploitBench, a capability-graded benchmark that measures the progressive stages of exploitation, demonstrating that while current frontier models can easily trigger bugs, achievi…

View →

cs.CRRecentApr 20, 2026

Capturing Monetarily Exploitable Vulnerability in Smart Contracts via Auditor Knowledge-Learning Fuzzing

Bowen Cai, Weiheng Bai, Hangyun Tang, Youshui Lu +1 more

The paper introduces FAUDITOR, a specialized, self-learning fuzzer that detects complex Monetarily Exploitable Vulnerabilities (MEVuls) in smart contracts by integrating NLP-processed auditor knowledg…

View →

cs.CRcs.LGRecentMay 26, 2026

SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?

Hwiwon Lee, Jiawei Liu, Dongjun Kim, Ziqi Zhang +2 more

The paper introduces SEC-bench Pro, a rigorous benchmark for evaluating LLM-based bug hunting on complex software, finding that even advanced agents struggle with long-horizon security tasks.

View →

cs.CRRecentMay 9, 2026

Smart Contract Security Beyond Detection

Tamer Abdelaziz

This paper outlines a comprehensive research framework for smart contract security, moving beyond simple vulnerability detection to encompass advanced areas like semantic reasoning, automated repair,…

View →

cs.CRcs.AIcs.MARecentJun 2, 2026

FORGE: Multi-Agent Graduated Exploitation and Detection Engineering

Farooq Shaikh

FORGE is a multi-agent system that integrates vulnerability exploitation, prioritization, and detection engineering into a single pipeline, achieving high-fidelity, multi-level exploitation and genera…

View →

cs.CRRecentApr 8, 2026

PoC-Adapt: Semantic-Aware Automated Vulnerability Reproduction with LLM Multi-Agents and Reinforcement Learning-Driven Adaptive Policy

Phan The Duy, Khoa Ngo-Khanh, Nguyen Huu Quyen, Van-Hau Pham

PoC-Adapt is an end-to-end framework that significantly improves the reliability and efficiency of automated vulnerability exploitation by integrating semantic state validation and reinforcement learn…

View →

cs.CRcs.AIcs.SERecentMay 5, 2026

MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents

Jonathan Steinberg, Oren Gal

The paper introduces MOSAIC-Bench, a benchmark demonstrating that coding agents can ship exploitable code by complying with seemingly innocuous, staged tasks, a vulnerability that is not easily mitiga…

View →

cs.CRcs.AIRecentMay 11, 2026

Benchmarking LLM-Based Static Analysis for Secure Smart Contract Development: Reliability, Limitations, and Potential Hybrid Solutions

Stefan-Claudiu Susan, Andrei Arusoaie, Dorel Lucanu

This paper benchmarks LLMs for smart contract security analysis, concluding that while LLMs show potential, their reliability is limited by lexical bias and requires integration with traditional stati…

View →

cs.AIcs.CRRecentMay 27, 2026

Paper Agents, Paper Gains: An Empirical Analysis of DeFi Investment Agents

Jay Yu, Amy Zhao, Danning Sui

The paper analyzes the nascent DeFi investment agent market, finding that while token valuations are high, current deployments are heterogeneous, lack clear autonomous execution, and exhibit poor risk…

View →

cs.AIcs.CRRecentMay 27, 2026

Paper Agents, Paper Gains: An Empirical Analysis of DeFi Investment Agents

Jay Yu, Amy Zhao, Danning Sui

The paper empirically analyzes the nascent DeFi investment agent market, finding that while token valuations are high, current deployments lack robust autonomous execution and exhibit poor risk-adjust…

View →

cs.CRRecentApr 3, 2026

ContractShield: Bridging Semantic-Structural Gaps via Hierarchical Cross-Modal Fusion for Multi-Label Vulnerability Detection in Obfuscated Smart Contracts

Minh-Dai Tran-Duong, Nguyen Hai Phong, Nguyen Chi Thanh, Doan Minh Trung +3 more

ContractShield is a robust multimodal framework that uses a novel three-level fusion mechanism to accurately detect multiple types of vulnerabilities in obfuscated smart contracts, significantly outpe…

View →

cs.CRcs.AIcs.LGRecentMay 11, 2026

ExploitGym: Can AI Agents Turn Security Vulnerabilities into Real Attacks?

Zhun Wang, Nico Schiller, Hongwei Li, Srijiith Sesha Narayana +12 more

The paper introduces ExploitGym, a large-scale benchmark, demonstrating that advanced AI agents can successfully turn theoretical software vulnerabilities into working exploits, highlighting growing c…

View →

cs.CRcs.SERecentApr 7, 2026

Guiding Symbolic Execution with Static Analysis and LLMs for Vulnerability Discovery

Md Shafiuzzaman, Achintya Desai, Wenbo Guo, Tevfik Bultan

SAILOR automates the construction of symbolic execution harnesses by combining static analysis and LLM-based synthesis, significantly improving the scalability and effectiveness of vulnerability disco…

View →

cs.SEcs.AIcs.CLRecentMay 17, 2026

ContraFix: Agentic Vulnerability Repair via Differential Runtime Evidence and Skill Reuse

Simiao Liu, Fang Liu, Li Zhang, Yang Liu +1 more

ContraFix is an agentic framework that improves automated vulnerability repair by using differential runtime evidence to pinpoint the root cause of bugs, achieving state-of-the-art performance on majo…

View →

cs.CRRecentMay 20, 2026

VIPER-MCP: Detecting and Exploiting Taint-Style Vulnerabilities in Model Context Protocol Servers

Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more

VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…

View →

cs.CRcs.AIcs.CLRecentApr 6, 2026

Mapping the Exploitation Surface: A 10,000-Trial Taxonomy of What Makes LLM Agents Exploit Vulnerabilities

Charafeddine Mouzouni

The paper systematically maps LLM agent vulnerabilities by testing 10,000 prompt variations, finding that 'goal reframing' language is the primary trigger for exploitation, rather than broad adversari…

View →

cs.CRcs.AIRecentJun 4, 2026

AttackPathGNN: Cross-function vulnerability detection in smart contracts using state interference graphs and conjunction pooling

Gabriela Dobrita, Simona-Vasilica Oprea, Adela Bara

AttackPathGNN proposes a novel graph neural network approach to detect smart contract vulnerabilities by modeling explicit attack paths and function interactions, achieving high detection rates on sta…

View →