~ similar to 2605.07814v1· 20 results
The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…
The paper investigates how AI coding assistants shift developers' security focus from proactive prevention to reactive review, finding that this structural change is reinforced by current tool interac…
This study formally verified 3,500 AI-generated code artifacts and found that a majority (55.8%) contain exploitable security vulnerabilities, regardless of the LLM used.
The paper analyzes LLM vulnerability detection using mechanistic interpretability, finding that models primarily rely on safety detectors rather than direct vulnerability signature recognition.
The paper introduces False Security Confidence (FSC), a new metric to measure the inherent prevalence of security vulnerabilities in code generated by LLMs that are otherwise functionally correct, eve…
The paper argues that current Software Bills of Materials (SBOMs) are fundamentally flawed due to a lack of shared understanding regarding what constitutes a 'component,' demonstrating that existing t…
The paper proposes a general, compiler-integrated framework for secure content composition that minimizes the syntactic difference between secure and insecure coding practices.
Zijun Feng, Yuming Feng, Yu Wang, Weizhe Zhang +3 more
GoAT-X introduces a novel framework that structures cross-chain smart contract auditing as a Graph of Auditing Thoughts, significantly improving the detection of complex, semantic vulnerabilities in m…
This review analyzes the dual impact of integrating Large Language Models (LLMs) into hardware design, detailing both their transformative potential in EDA and the critical security vulnerabilities th…
This study conducts a large-scale longitudinal analysis of CodeQL, finding that while the tool is effective at detecting vulnerabilities, its detection capabilities are not guaranteed to be stable acr…
Shenao Yan, Shimaa Ahmed, Shan Jin, Sunpreet S. Arora +3 more
The paper introduces CodeScan, a novel black-box framework that detects data poisoning in code generation LLMs by analyzing structural similarities across multiple generations to identify recurring, v…
This systematic mapping survey reviews label-efficient approaches for code vulnerability detection, synthesizing five paradigm families and providing a decision guide to navigate trade-offs.
Zirui Chen, Qi Zhan, Jiayuan Zhou, Xing Hu +2 more
This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for e…
Hao Wang, Niels Mündler, Mark Vero, Jingxuan He +2 more
The paper introduces SecPI, a fine-tuning pipeline that teaches reasoning language models (RLMs) to autonomously internalize structured security reasoning, significantly improving secure code generati…
This paper systematically surveys adaptive and AI-augmented security testing, concluding that a major gap exists—structural-adaptive fragmentation—where current systems fail to integrate structural pr…
This paper identifies the 'Format-Reliability Gap'—where LLMs know about code vulnerabilities but generate insecure code anyway—and proposes a localized, per-vulnerability steering vector fix that sig…
This survey analyzed 132 web application security tutorials, finding that most lack concrete implementation details and recommending that the presence of runnable code and links to official resources…
Shams Tarek, Dipayan Saha, Khan Thamid Hasan, Sujan Kumar Saha +2 more
Assertain is an automated framework that uses large language models and design analysis to generate high-quality, executable security assertions for hardware designs, significantly outperforming state…
The study analyzes coding patterns in malware versus benign software, finding that malware code is optimized for quick evasion and secrecy rather than maintainability, though its metrics are not uniqu…
The paper proposes an automated, standardized framework to empirically compare the security quality of code generated through human-only, LLM-only, and hybrid collaboration methods.