~ similar to 2606.00625v1· 20 results
Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more
VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…
Zirui Chen, Qi Zhan, Jiayuan Zhou, Xing Hu +2 more
This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for e…
The paper conducts an empirical evaluation of automated vulnerability detection tools across multiple software ecosystems using a curated ground-truth dataset derived from OSV, highlighting systematic…
The paper provides a formal proof that a single C program can contain a countably infinite number of distinct, independently assignable software vulnerabilities, suggesting the set of all software vul…
The paper proposes GCVE, a decentralized, open, and extensible socio-technical model to standardize and enrich the entire lifecycle of vulnerability information, moving beyond simple identifier alloca…
The paper introduces a novel multi-LLM orchestration system combined with symbolic execution to successfully detect memory vulnerabilities in uncompilable, incomplete Rust CVE code snippets, achieving…
The paper introduces a novel, large-scale dataset of vulnerable code snippets linked to CAPEC and CWE, generated using advanced LLMs, to improve automatic vulnerability detection.
Ayush Garg, Sophia Hager, Jacob Montiel, Aditya Tiwari +4 more
RuleForge is an automated system that generates and validates detection rules for web vulnerabilities from structured CVE templates, significantly improving detection accuracy and reducing false posit…
The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…
The paper introduces False Security Confidence (FSC), a new metric to measure the inherent prevalence of security vulnerabilities in code generated by LLMs that are otherwise functionally correct, eve…
This paper empirically demonstrates that current Static Application Security Testing (SAST) tools are fundamentally unreliable against common JavaScript obfuscation techniques, showing that obfuscatio…
The paper presents Broken Quantum, a comprehensive formal security audit that identifies 547 security vulnerabilities across 45 open-source quantum computing simulators, revealing critical flaws in me…
The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…
Tian Dong, Yanjun Chen, Shoufeng Zhang, Huaien Zhang +5 more
This paper measures the prevalence of recurring vulnerability patterns (variants) across multiple AI infrastructure repositories and proposes INFRASCOPE, a framework to automatically detect these vari…
Ze Sheng, Zhicheng Chen, Qingxiao Xu, Kewen Zhu +1 more
FuzzingBrain V2 is a multi-agent LLM system that significantly improves automated vulnerability discovery by ensuring all reported bugs are fuzzer-reproducible and handling complex cross-function depe…
Shravya Kanchi, Xiaoyan Zang, Ying Zhang, Danfeng Yao +1 more
The paper introduces PoVSmith, an agent-based system that uses large language models and call path analysis to automatically generate and assess proof-of-vulnerability tests, significantly improving t…
This paper systematically surveys adaptive and AI-augmented security testing, concluding that a major gap exists—structural-adaptive fragmentation—where current systems fail to integrate structural pr…
Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more
The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…
The paper introduces CrossCommitVuln-Bench, a benchmark dataset demonstrating that many real-world Python vulnerabilities are introduced across multiple commits, making them invisible to standard per-…
Ze Sheng, Dmitrijs Trizna, Luigino Camastra, Zhicheng Chen +2 more
The paper introduces QuartetFuzz, an autonomous system that systematically ensures the correctness of fuzzing harnesses using a novel Four Principles framework, significantly improving vulnerability d…