Papers similar to 2605.19668v1

~ similar to 2605.19668v1· 20 results

cs.CRcs.PLcs.SERecentApr 28, 2026

Symbolic Execution Meets Multi-LLM Orchestration: Detecting Memory Vulnerabilities in Incomplete Rust CVE Snippets

The paper introduces a novel multi-LLM orchestration system combined with symbolic execution to successfully detect memory vulnerabilities in uncompilable, incomplete Rust CVE code snippets, achieving…

View →

cs.CRcs.AIRecentMay 4, 2026

APIOT: Autonomous Vulnerability Management Across Bare-Metal Industrial OT Networks

Adel ElZemity, Budi Arief, Shujun Li, Calvin Brierley +5 more

The paper introduces APIOT, the first LLM framework capable of autonomously performing the full discovery, exploitation, patching, and verification cycle against bare-metal industrial OT devices.

View →

cs.CRRecentMay 20, 2026

VIPER-MCP: Detecting and Exploiting Taint-Style Vulnerabilities in Model Context Protocol Servers

Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more

VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…

View →

cs.CRcs.AIRecentApr 14, 2026

LogicEval: A Systematic Framework for Evaluating Automated Repair Techniques for Logical Vulnerabilities in Real-World Software

Syed Md Mukit Rashid, Abdullah Al Ishtiaq, Kai Tu, Yilu Dong +6 more

The paper introduces LogicEval, a systematic framework and dataset (LogicDS) to evaluate automated repair techniques for logical software vulnerabilities, finding that prompt sensitivity and context l…

View →

cs.SEcs.AIcs.CLRecentMay 17, 2026

ContraFix: Agentic Vulnerability Repair via Differential Runtime Evidence and Skill Reuse

Simiao Liu, Fang Liu, Li Zhang, Yang Liu +1 more

ContraFix is an agentic framework that improves automated vulnerability repair by using differential runtime evidence to pinpoint the root cause of bugs, achieving state-of-the-art performance on majo…

View →

cs.SEcs.CRRecentMay 27, 2026

Towards Demystifying and Repairing LLM-in-the-Loop Vulnerabilities

Yujie Ma, Jialin Rong, Chenxi Yang, Lili Quan +3 more

The paper addresses the gap in understanding real-world LLM-in-the-loop vulnerabilities by creating the LLMCVE dataset and demonstrating that these vulnerabilities are significantly harder to repair t…

View →

cs.CRcs.NIcs.SERecentMay 6, 2026

AFL-ICP: Enhancing Industrial Control Protocol Reliability via Specification-Guided Fuzzing

Jiaying Meng, Xuewei Feng, Qi Li, Min Liu +1 more

AFL-ICP is a novel specification-driven fuzzing framework that significantly enhances the security testing of industrial control protocols by detecting subtle semantic and logic bugs missed by traditi…

View →

cs.CRRecentApr 8, 2026

PSR2: A Phase-based Semantic Reasoning Framework for Atomicity Violation Detection via Contract Refinement

Xiaoqi Li, Xin Wang, Wenkai Li, Zongwei Li

The paper introduces PSR extsuperscript{2}, a novel static analysis framework that significantly improves the detection of atomicity violations in smart contracts by combining structural path searchin…

View →

cs.CRcs.AIcs.MARecentApr 20, 2026

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs

Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more

The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…

View →

cs.CRcs.SERecentApr 5, 2026

LLM-Enabled Open-Source Systems in the Wild: An Empirical Study of Vulnerabilities in GitHub Security Advisories

Fariha Tanjim Shifat, Hariswar Baburaj, Ce Zhou, Jaydeb Sarker +1 more

The paper analyzes GitHub security advisories for LLM-integrated open-source systems, finding that while most vulnerabilities map to existing code-level weaknesses, the architectural risks like Supply…

View →

cs.CRcs.AIRecentMay 13, 2026

ExploitBench: A Capability Ladder Benchmark for LLM Cybersecurity Agents

Seunghyun Lee, David Brumley

The paper introduces ExploitBench, a capability-graded benchmark that measures the progressive stages of exploitation, demonstrating that while current frontier models can easily trigger bugs, achievi…

View →

cs.CRcs.AIRecentApr 1, 2026

Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks

Anubhab Sahu, Diptisha Samanta, Reza Soosahabi

The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…

View →

cs.CRcs.LGRecentMay 6, 2026

Agentic Vulnerability Reasoning on Windows COM Binaries

Hwiwon Lee, Jongseong Kim, Lingming Zhang

The paper introduces SLYP, an agentic pipeline that significantly improves the discovery of race condition vulnerabilities in Windows COM binaries and autonomously generates verified proof-of-concept…

View →

cs.CRRecentApr 2, 2026

Assertain: Automated Security Assertion Generation Using Large Language Models

Shams Tarek, Dipayan Saha, Khan Thamid Hasan, Sujan Kumar Saha +2 more

Assertain is an automated framework that uses large language models and design analysis to generate high-quality, executable security assertions for hardware designs, significantly outperforming state…

View →

cs.CRcs.SCRecentMay 25, 2026

Heimdall: Formally Verified Automated Migration of Legacy eBPF Programs to Rust

Vishnu Asutosh Dasu, Monika Santra, Md Rafi Ur Rashid, Ashish Kumar +2 more

The paper introduces Heimdall, an automated pipeline that uses LLMs and formal verification to safely and automatically migrate legacy, potentially buggy eBPF programs written in C to memory-safe Rust…

View →

cs.CRRecentMay 19, 2026

Hunting Vulnerability Variants in AI Infra: Measurement and Reference-Driven Detection

Tian Dong, Yanjun Chen, Shoufeng Zhang, Huaien Zhang +5 more

This paper measures the prevalence of recurring vulnerability patterns (variants) across multiple AI infrastructure repositories and proposes INFRASCOPE, a framework to automatically detect these vari…

View →

cs.CRcs.AIRecentApr 28, 2026

From CRUD to Autonomous Agents: Formal Validation and Zero-Trust Security for Semantic Gateways in AI-Native Enterprise Systems

Ignacio Peyrano

The paper proposes a Semantic Gateway and a Zero-Trust security model to formally validate and secure autonomous AI agents operating in enterprise systems, achieving a 100% discovery rate of unauthori…

View →

cs.CRcs.SERecentApr 7, 2026

Guiding Symbolic Execution with Static Analysis and LLMs for Vulnerability Discovery

Md Shafiuzzaman, Achintya Desai, Wenbo Guo, Tevfik Bultan

SAILOR automates the construction of symbolic execution harnesses by combining static analysis and LLM-based synthesis, significantly improving the scalability and effectiveness of vulnerability disco…

View →

cs.CRcs.AIRecentMar 29, 2026

A Security Analysis of the OpenClaw AI Agent Framework

Surada Suwansathit, Yuxuan Zhang, Guofei Gu

This paper analyzes 470 security advisories in the OpenClaw AI agent framework, demonstrating that the system's structural weakness lies in per-layer trust enforcement, enabling cross-layer remote cod…

View →

cs.CRRecentMay 7, 2026

Benchmarking Large Language Models for IoC Recovery under Adversarial Code Obfuscation and Encryption

Jaime Morales, Sergio Pastrana, Juan Tapiador

The paper introduces a systematic benchmark to test LLMs' ability to recover Indicators of Compromise (IoCs) from JavaScript code, finding that while LLMs handle simple obfuscation well, encryption-ba…

View →