~ similar to 2604.26495v2· 20 results
The paper introduces an efficient, lightweight LLM framework for smart contract auditing that decouples the audit process into multiple components, achieving high accuracy while significantly reducing…
Zijun Feng, Yuming Feng, Yu Wang, Weizhe Zhang +3 more
GoAT-X introduces a novel framework that structures cross-chain smart contract auditing as a Graph of Auditing Thoughts, significantly improving the detection of complex, semantic vulnerabilities in m…
QCIVET introduces a novel contract-based framework to ensure the integrity of hybrid quantum-classical pipelines by verifying both the structure (syntactic) and the behavior (semantic) of quantum stag…
The paper introduces TRAILS~, a novel method that improves code correctness validation by grounding LLM reasoning in concrete (input, output) pairs derived from specifications, achieving state-of-the-…
This study formally verified 3,500 AI-generated code artifacts and found that a majority (55.8%) contain exploitable security vulnerabilities, regardless of the LLM used.
Xiang Liu, Sa Song, Zhaowei Zhang, Huiying Lan +5 more
The paper introduces Agora, a domain-aware multi-agent framework that successfully detects deep, previously unknown logic bugs in complex consensus protocols, outperforming existing LLM-based analysis…
The paper proposes Agentic Witnessing, a TEE-enabled framework that allows external verifiers to audit the qualitative properties of private datasets by querying an LLM-based auditor without accessing…
Dalila Ressi, Alvise Spanò, Matteo Rizzo, Lorenzo Benetollo +1 more
This paper evaluates modern reentrancy detection tools, finding that leading LLMs significantly outperform most existing static analyzers and ML models on both real-world and handcrafted benchmarks.
The paper introduces SCDBench, a comprehensive benchmark dataset and methodology that rigorously evaluates LLM-based smart contract decompilers, finding that while frontier models can produce compilab…
The paper introduces SCDBench, a comprehensive benchmark dataset and methodology that rigorously evaluates LLM-based smart contract decompilers, finding that while frontier LLMs can generate compilabl…
Yuwei Liu, Xinyi Wan, Yanhao Wang, Minghua Wang +2 more
KVerus is a retrieval-augmented system that significantly improves the scalability and resilience of formal verification for Rust code by managing complex cross-module dependencies and adapting to cod…
Hongbo Wen, Ying Li, Hanzhi Liu, Chaofan Shou +3 more
Semia is a novel static auditor that translates complex, prose-defined agent skills into a verifiable Datalog fact base, enabling the detection of critical security vulnerabilities in real-world LLM a…
The paper proposes a trust-boundary architecture using Lean 4 to verify the deterministic structured computations surrounding LLM pipelines, providing verifiable certificates for high-stakes deploymen…
Krishiv Agarwal, Ramneet Kaur, Colin Samplawski, Manoj Acharya +5 more
The paper conducts an interpretability-driven safety audit of eight state-of-the-art LLMs, demonstrating that while interpretability-based steering is a powerful auditing tool, model robustness varies…
The paper introduces FVSpec, a large-scale benchmark that translates thousands of real-world Python property-based tests into formal Lean 4 specifications to evaluate AI models for formal software ver…
Tingda Shen, Yebo Feng, Konglin Zhu, Xiaojun Jia +2 more
The paper introduces SIGIL, a novel framework that cryptographically seals the entire lifecycle of LLM skills, ensuring verifiable integrity from publication through runtime execution to prevent suppl…
The paper introduces PSR extsuperscript{2}, a novel static analysis framework that significantly improves the detection of atomicity violations in smart contracts by combining structural path searchin…
The paper introduces MOSAIC-Bench, a benchmark demonstrating that coding agents can ship exploitable code by complying with seemingly innocuous, staged tasks, a vulnerability that is not easily mitiga…
The paper introduces a novel multi-LLM orchestration system combined with symbolic execution to successfully detect memory vulnerabilities in uncompilable, incomplete Rust CVE code snippets, achieving…
The paper proposes a tamper-proof fraud detection system that uses blockchain smart contracts to immutably record ML predictions and workflow executions, addressing the vulnerability of controllable a…