Papers similar to 2604.12172v1

~ similar to 2604.12172v1· 19 results

cs.CRcs.AIcs.SERecentApr 7, 2026

Broken by Default: A Formal Verification Study of Security Vulnerabilities in AI-Generated Code

This study formally verified 3,500 AI-generated code artifacts and found that a majority (55.8%) contain exploitable security vulnerabilities, regardless of the LLM used.

View →

cs.CRcs.LGRecentMay 28, 2026

Dissecting the Black Box: Circuit-Level Analysis of LLM Vulnerability Detection

Syafiq Al Atiiq, Chun Zhou, Christian Gehrmann

The paper analyzes LLM vulnerability detection using mechanistic interpretability, finding that models primarily rely on safety detectors rather than direct vulnerability signature recognition.

View →

cs.CRcs.SERecentApr 7, 2026

Guiding Symbolic Execution with Static Analysis and LLMs for Vulnerability Discovery

Md Shafiuzzaman, Achintya Desai, Wenbo Guo, Tevfik Bultan

SAILOR automates the construction of symbolic execution harnesses by combining static analysis and LLM-based synthesis, significantly improving the scalability and effectiveness of vulnerability disco…

View →

cs.CRcs.ARcs.LGRecentMay 11, 2026

LLMs for Secure Hardware Design and Related Problems: Opportunities and Challenges

Johann Knechtel, Ozgur Sinanoglu, Ramesh Karri

This review analyzes the dual impact of integrating Large Language Models (LLMs) into hardware design, detailing both their transformative potential in EDA and the critical security vulnerabilities th…

View →

cs.CRRecentApr 27, 2026

GoAT-X: A Graph of Auditing Thoughts for Securing Token Transactions in Cross-Chain Contracts

Zijun Feng, Yuming Feng, Yu Wang, Weizhe Zhang +3 more

GoAT-X introduces a novel framework that structures cross-chain smart contract auditing as a Graph of Auditing Thoughts, significantly improving the detection of complex, semantic vulnerabilities in m…

View →

cs.CRcs.PLcs.SERecentApr 28, 2026

Symbolic Execution Meets Multi-LLM Orchestration: Detecting Memory Vulnerabilities in Incomplete Rust CVE Snippets

Zeyad Abdelrazek, Young Lee

The paper introduces a novel multi-LLM orchestration system combined with symbolic execution to successfully detect memory vulnerabilities in uncompilable, incomplete Rust CVE code snippets, achieving…

View →

cs.CRRecentApr 2, 2026

AI-Assisted Hardware Security Verification: A Survey and AI Accelerator Case Study

Khan Thamid Hasan, Md Ajoad Hasan, Nashmin Alam, Md. Touhidul Islam +2 more

This survey reviews the integration of AI and LLMs into hardware security verification, demonstrating its potential to automate complex stages while stressing the necessity of grounding AI outputs in…

View →

cs.SEcs.CRRecentMay 21, 2026

Finding Missing Input Validation in TEEs via LLM-Assisted Symbolic Execution

Chengyan Ma, Jieke Shi, Ruidong Han, Ye Liu +2 more

The paper introduces SymTEE, an LLM-assisted symbolic execution framework that detects missing input validation vulnerabilities in TEE applications without needing complex, real TEE setups.

View →

cs.CRRecentMay 30, 2026

NeuroLog: Reasoning You Can Audit -- Neuro-Symbolic Vulnerability Discovery via LLM Facts, Datalog, and SMT

Sanjay Rawat

NeuroLog is a novel, build-free neuro-symbolic pipeline that combines LLM-derived dataflow facts, Datalog, and SMT solving to systematically discover and synthesize exploitable memory safety vulnerabi…

View →

cs.SEcs.CRRecentMay 5, 2026

KVerus: Scalable and Resilient Formal Verification Proof Generation for Rust Code

Yuwei Liu, Xinyi Wan, Yanhao Wang, Minghua Wang +2 more

KVerus is a retrieval-augmented system that significantly improves the scalability and resilience of formal verification for Rust code by managing complex cross-module dependencies and adapting to cod…

View →

cs.CRcs.LGRecentApr 22, 2026

Breaking Bad: Interpretability-Based Safety Audits of State-of-the-Art LLMs

Krishiv Agarwal, Ramneet Kaur, Colin Samplawski, Manoj Acharya +5 more

The paper conducts an interpretability-driven safety audit of eight state-of-the-art LLMs, demonstrating that while interpretability-based steering is a powerful auditing tool, model robustness varies…

View →

cs.CRcs.LGRecentApr 24, 2026

Sovereign Agentic Loops: Decoupling AI Reasoning from Execution in Real-World Systems

Jun He, Deying Yu

The paper introduces Sovereign Agentic Loops (SAL), a control-plane architecture that decouples LLM reasoning from system execution to enhance safety and reliability in real-world AI agents.

View →

cs.SEcs.CRRecentMay 11, 2026

AutoSOUP: Safety-Oriented Unit Proof Generation for Component-level Memory-Safety Verification

Paschal C. Amusuo, Ricardo Calvo, Dharun Anandayuvaraj, Taylor Le Lievre +4 more

AutoSOUP is a system that automates component-level memory-safety verification by generating Safety-Oriented Unit Proofs, leveraging a hybrid LLM-based architecture to overcome manual workflow limitat…

View →

cs.CRcs.AIRecentJun 2, 2026

Learn from Your Mistakes: Tree-like Self-Play for Secure Code LLMs

Wenqi Chen, Ziyan Zhang, Bing Wang, Lin Liu +2 more

The paper introduces Tree-like Self-Play (TSP), a novel framework that treats secure code generation as a fine-grained decision process, significantly improving LLM security by forcing the model to se…

View →

cs.AIcs.CRcs.SERecentMar 19, 2026

Implicit Patterns in LLM-Based Binary Analysis

Qiang Li, XiangRui Zhang, Haining Wang

This paper analyzes large-scale reasoning traces from LLM-based binary vulnerability analysis, identifying four structured, token-level implicit patterns that govern how LLMs explore code paths.

View →

cs.CRRecentMar 30, 2026

VulnScout-C: A Lightweight Transformer for C Code Vulnerability Detection

Aymen Lassoued, Nacef Mbarek, Bechir Dardouri, Bassem Ouni +2 more

The paper introduces VULNSCOUT-C, a compact, specialized transformer model that achieves state-of-the-art performance in C code vulnerability detection while maintaining low inference cost, making it…

View →

cs.SEcs.AIcs.CRRecentApr 12, 2026

Verify Before You Fix: Agentic Execution Grounding for Trustworthy Cross-Language Code Analysis

Jugal Gajjar

The paper introduces an execution-grounded, cross-language framework that significantly improves the reliability of LLM-driven code vulnerability analysis by ensuring that all proposed fixes are confi…

View →

cs.CRcs.AIRecentApr 22, 2026

Mythos and the Unverified Cage: Z3-Based Pre-Deployment Verification for Frontier-Model Sandbox Infrastructure

Dominik Blain

The paper introduces COBALT, a Z3 SMT-based formal verification engine, to proactively detect arithmetic vulnerabilities (CWE-190/191/195) in the critical infrastructure surrounding frontier AI models…

View →

cs.ARcs.AIcs.CRRecentApr 15, 2026

VeriCWEty: Embedding enabled Line-Level CWE Detection in Verilog

Prithwish Basu Roy, Zeng Wang, Anatolii Chuvashlov, Weihua Xiao +3 more

VeriCWEty proposes an embedding-based framework to detect and classify common software vulnerabilities (CWEs) in Verilog RTL code at both module and line levels, achieving high detection accuracy.

View →