Papers similar to 2605.13100v1

Similar to 2605.13100v1 · 20 results

cs.HC · cs.CR · Recent · May 22, 2026

From Preventive to Reactive: How AI Coding Assistants Transform Developers' Security Awareness

Faisal Haque Bappy, Tahrim Hossain, Sidratul Muntaher Meheraj, Annoor Sharara Akhand +4 more

The paper investigates how AI coding assistants shift developers' security focus from proactive prevention to reactive review, finding that this structural change is reinforced by current tool interac…

cs.SE · cs.CR · Recent · Apr 15, 2026

Analysis of Commit Signing on Github

Abubakar Sadiq Shittu, John Sadik, Farzin Gholamrezae, Scott Ruoti

This study provides an ecosystem-scale measurement of commit signing on GitHub, finding that current signing adoption rates are misleading and that developers struggle to maintain consistent, long-ter…

cs.CR · cs.AI · cs.SE · Recent · May 5, 2026

MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents

Jonathan Steinberg, Oren Gal

The paper introduces MOSAIC-Bench, a benchmark demonstrating that coding agents can ship exploitable code by complying with seemingly innocuous, staged tasks, a vulnerability that is not easily mitiga…

cs.CR · cs.AI · cs.LG · Recent · May 23, 2026

Demystifying the Mythos or Disrupting Bugonomics? From Zero-Day Asymmetry to Defender Remediation Throughput

Alfredo Pesoli, Herman Errico, Lorenzo Cavallaro

The paper argues that the near-term impact of LLM-assisted vulnerability discovery is not simply an increase in zero-day volume, but a critical bottleneck in defender remediation throughput, shifting…

cs.CR · cs.AI · cs.LG · Recent · May 22, 2026

An Empirical Evaluation of LLM-Generated Code Security Across Prompting Methods

Mohammed Kharma, Ahmed Sabbah, Mohammad Alkhanafseh, Mohammad Hammoudeh +1 more

The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…

cs.CR · cs.CL · cs.CY · Recent · May 8, 2026

SecureForge: Finding and Preventing Vulnerabilities in LLM-Generated Code via Prompt Optimization

Houjun Liu, Lisa Einstein, John Yang, Joachim Baumann +4 more

SecureForge is an automated pipeline that significantly reduces cybersecurity vulnerabilities in LLM-generated code by optimizing system prompts, achieving up to a 48% reduction in output vulnerabilit…

cs.CR · cs.LG · Recent · Apr 20, 2026

A Quasi-Experimental Developer Study of Security Training in LLM-Assisted Web Application Development

Mohammed Kharma, Ahmed Sabbah, Radi Jarrar, Samer Zain +2 more

The study found that providing developers with a layer-based security training package significantly reduces the number and severity of security vulnerabilities in LLM-assisted web application develop…

cs.CR · cs.CE · Recent · Apr 5, 2026

Refunded but Rewarded: The Double Dip Attack on Cashback Reward Engines

S M Zia Ur Rashid, Suman Rath

The paper analyzes and documents various double-dip reward abuse attacks that exploit flaws in how cashback and reward engines handle transaction refunds, proposing formal invariants and defensive alg…

cs.CR · cs.CL · cs.LG · Recent · May 27, 2026

Code as a Weapon: A Consensus-Labeled Prompt Bank for Measuring Coding-Model Compliance with Malicious-Code Requests

Richard J. Young, Gregory D. Moody

The paper introduces a large, consensus-labeled prompt bank that reliably distinguishes between requests for executable malicious code and requests for harmful security knowledge, providing a standard…

cs.CR · Recent · Jun 4, 2026

Exploring the connection between coding habits and cognitive styles in malware developers

Vasilis Vouvoutsis, Constantinos Patsakis, Fran Casino

The study analyzes coding patterns in malware versus benign software, finding that malware code is optimized for quick evasion and secrecy rather than maintainability, though its metrics are not uniqu…

cs.CR · cs.AI · cs.LG · Recent · May 22, 2026

Enhancing Reliability in LLM-Based Secure Code Generation

Mohammed F. Kharma, Mohammad Alkhanafseh, Ahmed Sabbah, David Mohaisen

The paper introduces the Mitigation-Aware Chain-of-Thought (MA-CoT) framework, which significantly enhances the security reliability of code generated by LLMs across multiple languages and models.

cs.CR · cs.CL · cs.SE · Recent · May 28, 2026

Minimal Prompt Perturbations Lead to Code Vulnerabilities: Prompt Fragility and Hidden-State Signals in Coding LLMs

Alexander Sternfeld, Andrei Kucharavy, Ljiljana Dolamic

Minor, single-character perturbations to prompts can significantly degrade the security of code generated by LLMs, suggesting that prompt fragility is a major security concern beyond simple prompt inj…

cs.CR · cs.SE · Recent · Mar 26, 2026

AVDA: Autonomous Vibe Detection Authoring for Cybersecurity

Fatih Bulut, Carlo DePaolis, Raghav Batta, Anjali Mangal

The paper introduces AVDA, a framework that uses the Model Context Protocol (MCP) to automate cybersecurity detection authoring by integrating organizational context into AI code generation, achieving…

cs.CR · cs.CY · cs.SE · Recent · Apr 15, 2026

Towards Personalizing Secure Programming Education with LLM-Injected Vulnerabilities

Matthew Frazier, Kostadin Damevski

The paper proposes using LLMs to inject personalized security vulnerabilities (CWEs) into students' own code to improve secure programming education, finding that while students found the method engag…

cs.CR · cs.AI · Recent · May 22, 2026

AI Security Research Should Better Incentivize Defense Research

Youqian Zhang

The paper argues that AI security research is imbalanced, focusing too much on demonstrating attacks and not enough on developing practical, usable defenses.

cs.CR · cs.AI · Recent · Apr 2, 2026

Seclens: Role-specific Evaluation of LLM's for security vulnerablity detection

Subho Halder, Siddharth Saxena, Kashinath Kadaba Shrish, Thiyagarajan M

The paper introduces SecLens-R, a multi-stakeholder evaluation framework, demonstrating that LLM performance for vulnerability detection varies significantly depending on the specific priorities (e.g.…

cs.SE · cs.AI · cs.HC · Recent · May 28, 2026

How Coding Agents Fail Their Users: A Large-Scale Analysis of Developer-Agent Misalignment in 20,574 Real-World Sessions

Ningzhi Tang, Chaoran Chen, Gelei Xu, Yiyu Shi +4 more

This study analyzes over 20,000 real-world coding sessions to show that AI coding agents frequently fail users through subtle misalignment, requiring constant manual correction even when major system…

cs.SE · cs.CR · Recent · Mar 27, 2026

A Large-scale Empirical Study on the Generalizability of Disclosed Java Library Vulnerability Exploits

Zirui Chen, Qi Zhan, Jiayuan Zhou, Xing Hu +2 more

This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for e…

cs.CR · cs.AI · Recent · Apr 1, 2026

VibeGuard: A Security Gate Framework for AI-Generated Code

Ying Xie

The paper introduces VibeGuard, a pre-publish security gate framework designed to detect novel vulnerabilities—such as source map exposure and packaging drift—that arise from developers over-relying o…

cs.CR · Recent · Apr 18, 2026

False Security Confidence in Benign LLM Code Generation

Xiaolei Ren

The paper introduces False Security Confidence (FSC), a new metric to measure the inherent prevalence of security vulnerabilities in code generated by LLMs that are otherwise functionally correct, eve…
