~ similar to 2605.16561v1· 20 results
Minor, single-character perturbations to prompts can significantly degrade the security of code generated by LLMs, suggesting that prompt fragility is a major security concern beyond simple prompt inj…
The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…
Hao Wang, Niels Mündler, Mark Vero, Jingxuan He +2 more
The paper introduces SecPI, a fine-tuning pipeline that teaches reasoning language models (RLMs) to autonomously internalize structured security reasoning, significantly improving secure code generati…
SAILOR automates the construction of symbolic execution harnesses by combining static analysis and LLM-based synthesis, significantly improving the scalability and effectiveness of vulnerability disco…
VulStyle introduces a multi-modal model that jointly encodes source code, non-terminal AST structure, and code stylometry features to achieve state-of-the-art performance in software vulnerability det…
The paper proposes an automated, standardized framework to empirically compare the security quality of code generated through human-only, LLM-only, and hybrid collaboration methods.
Shenao Yan, Shimaa Ahmed, Shan Jin, Sunpreet S. Arora +3 more
The paper introduces CodeScan, a novel black-box framework that detects data poisoning in code generation LLMs by analyzing structural similarities across multiple generations to identify recurring, v…
This study formally verified 3,500 AI-generated code artifacts and found that a majority (55.8%) contain exploitable security vulnerabilities, regardless of the LLM used.
This paper identifies the 'Format-Reliability Gap'—where LLMs know about code vulnerabilities but generate insecure code anyway—and proposes a localized, per-vulnerability steering vector fix that sig…
The paper introduces codebadger, a Model Context Protocol (MCP) server that integrates Joern's Code Property Graph (CPG) with LLMs, enabling large language models to perform large-scale, semantic prog…
Meifang Chen, Zhe Yang, Huang Nianchen, Yizhan Huang +3 more
This paper investigates how Byte-Pair Encoding (BPE) tokenization causes Code LLMs to disproportionately memorize certain types of secrets, a phenomenon termed 'gibberish bias'.
Houjun Liu, Lisa Einstein, John Yang, Joachim Baumann +4 more
SecureForge is an automated pipeline that significantly reduces cybersecurity vulnerabilities in LLM-generated code by optimizing system prompts, achieving up to a 48% reduction in output vulnerabilit…
The paper introduces a novel, large-scale dataset of vulnerable code snippets linked to CAPEC and CWE, generated using advanced LLMs, to improve automatic vulnerability detection.
Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more
The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…
Chengyan Ma, Jieke Shi, Ruidong Han, Ye Liu +2 more
The paper introduces SymTEE, an LLM-assisted symbolic execution framework that detects missing input validation vulnerabilities in TEE applications without needing complex, real TEE setups.
This paper empirically demonstrates that current Static Application Security Testing (SAST) tools are fundamentally unreliable against common JavaScript obfuscation techniques, showing that obfuscatio…
The paper introduces the Mitigation-Aware Chain-of-Thought (MA-CoT) framework, which significantly enhances the security reliability of code generated by LLMs across multiple languages and models.
This study conducts a large-scale longitudinal analysis of CodeQL, finding that while the tool is effective at detecting vulnerabilities, its detection capabilities are not guaranteed to be stable acr…
Aymen Lassoued, Nacef Mbarek, Bechir Dardouri, Bassem Ouni +2 more
The paper introduces VULNSCOUT-C, a compact, specialized transformer model that achieves state-of-the-art performance in C code vulnerability detection while maintaining low inference cost, making it…
This systematic mapping survey reviews label-efficient approaches for code vulnerability detection, synthesizing five paradigm families and providing a decision guide to navigate trade-offs.