Papers similar to 2604.01052v1

~ similar to 2604.01052v1· 20 results

cs.SEcs.AIcs.CRRecentApr 4, 2026

Measuring the Permission Gate: A Stress-Test Evaluation of Claude Code's Auto Mode

Zimo Ji, Zongjie Li, Wenyuan Jiang, Yudong Gao +1 more

The paper independently stress-tests Claude Code's auto mode permission system using a deliberately ambiguous benchmark, finding that its true false negative rate is significantly higher than reported…

View →

cs.HCcs.CRRecentMay 22, 2026

From Preventive to Reactive: How AI Coding Assistants Transform Developers' Security Awareness

Faisal Haque Bappy, Tahrim Hossain, Sidratul Muntaher Meheraj, Annoor Sharara Akhand +4 more

The paper investigates how AI coding assistants shift developers' security focus from proactive prevention to reactive review, finding that this structural change is reinforced by current tool interac…

View →

cs.CRcs.SERecentMar 26, 2026

AVDA: Autonomous Vibe Detection Authoring for Cybersecurity

Fatih Bulut, Carlo DePaolis, Raghav Batta, Anjali Mangal

The paper introduces AVDA, a framework that uses the Model Context Protocol (MCP) to automate cybersecurity detection authoring by integrating organizational context into AI code generation, achieving…

View →

cs.CRRecentApr 1, 2026

Obfuscating Code Vulnerabilities against Static Analysis in JavaScript Code

Francesco Pagano, Lorenzo Pisu, Leonardo Regano, Davide Maiorca +2 more

This paper empirically demonstrates that current Static Application Security Testing (SAST) tools are fundamentally unreliable against common JavaScript obfuscation techniques, showing that obfuscatio…

View →

cs.SEcs.CRcs.HCRecentApr 9, 2026

Security Concerns in Generative AI Coding Assistants: Insights from Online Discussions on GitHub Copilot

Nicolás E. Díaz Ferreyra, Monika Swetha Gurupathi, Zadia Codabux, Nalin Arachchilage +1 more

This paper analyzes online developer discussions to identify four major security concerns—data leakage, code licensing, adversarial attacks, and insecure suggestions—associated with using generative A…

View →

cs.CRcs.SERecentMay 29, 2026

How to Compare the Security of Code Written by Humans to LLM-generated Code

Rebecca Balebako, Jasmine Egl

The paper proposes an automated, standardized framework to empirically compare the security quality of code generated through human-only, LLM-only, and hybrid collaboration methods.

View →

cs.SEcs.AIcs.CRRecentApr 10, 2026

DeepGuard: Secure Code Generation via Multi-Layer Semantic Aggregation

Li Huang, Zhongxin Liu, Yifan Wu, Tao Yin +5 more

DeepGuard introduces a novel multi-layer semantic aggregation framework to enhance secure code generation by collecting vulnerability cues from multiple upper layers of LLMs, significantly improving s…

View →

cs.CRcs.SERecentMay 13, 2026

Security Incentivization: An Empirical Study of how Micropayments Impact Code Security

Stefan Rass, Martin Pinzger, Rainer W. Alexandrowicz, Georg Sengstbratl +4 more

The paper demonstrates that linking team bonus points to measurable security improvements significantly reduces code security issues in a controlled educational experiment.

View →

cs.CRcs.AIRecentApr 4, 2026

SecPI: Secure Code Generation with Reasoning Models via Security Reasoning Internalization

Hao Wang, Niels Mündler, Mark Vero, Jingxuan He +2 more

The paper introduces SecPI, a fine-tuning pipeline that teaches reasoning language models (RLMs) to autonomously internalize structured security reasoning, significantly improving secure code generati…

View →

cs.PLcs.CRRecentMay 15, 2026

Compile-time Security Analysis and Optimization of Sensitive String Producers

Mike Samuel, Tom Palmer, Shaw Summa, Robert Grayson

The paper proposes a general, compiler-integrated framework for secure content composition that minimizes the syntactic difference between secure and insecure coding practices.

View →

cs.CRRecentApr 27, 2026

GoAT-X: A Graph of Auditing Thoughts for Securing Token Transactions in Cross-Chain Contracts

Zijun Feng, Yuming Feng, Yu Wang, Weizhe Zhang +3 more

GoAT-X introduces a novel framework that structures cross-chain smart contract auditing as a Graph of Auditing Thoughts, significantly improving the detection of complex, semantic vulnerabilities in m…

View →

cs.CRcs.AIRecentMay 12, 2026

CoT-Guard: Small Models for Strong Monitoring

Nirav Diwan, Han Wang, Berkcan Kapusuzoglu, Ramin Moradi +5 more

The paper introduces CoT-Guard, a small, cost-effective 4B-parameter model that significantly outperforms large, expensive monitors like GPT-5 in detecting hidden objectives in code generation tasks.

View →

cs.CRcs.CLcs.CYRecentMay 8, 2026

SecureForge: Finding and Preventing Vulnerabilities in LLM-Generated Code via Prompt Optimization

Houjun Liu, Lisa Einstein, John Yang, Joachim Baumann +4 more

SecureForge is an automated pipeline that significantly reduces cybersecurity vulnerabilities in LLM-generated code by optimizing system prompts, achieving up to a 48% reduction in output vulnerabilit…

View →

cs.CRcs.AIcs.CLRecentApr 2, 2026

RuleForge: Automated Generation and Validation for Web Vulnerability Detection at Scale

Ayush Garg, Sophia Hager, Jacob Montiel, Aditya Tiwari +4 more

RuleForge is an automated system that generates and validates detection rules for web vulnerabilities from structured CVE templates, significantly improving detection accuracy and reducing false posit…

View →

cs.CRcs.AIRecentJun 3, 2026

From Attack Simulation to SIEM Rule: Deterministic Detection-as-Code Synthesis with Probe-Level Traceability

Alexandre Cristovão Maiorano

The paper introduces a deterministic method to automatically synthesize initial SIEM detection rules (Sigma rules) from attack simulation findings, ensuring full traceability back to the specific orig…

View →

cs.CRRecentApr 18, 2026

False Security Confidence in Benign LLM Code Generation

Xiaolei Ren

The paper introduces False Security Confidence (FSC), a new metric to measure the inherent prevalence of security vulnerabilities in code generated by LLMs that are otherwise functionally correct, eve…

View →

cs.CRcs.AIcs.LGRecentMay 22, 2026

An Empirical Evaluation of LLM-Generated Code Security Across Prompting Methods

Mohammed Kharma, Ahmed Sabbah, Mohammad Alkhanafseh, Mohammad Hammoudeh +1 more

The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…

View →

cs.CRcs.AIcs.LGRecentMay 22, 2026

Enhancing Reliability in LLM-Based Secure Code Generation

Mohammed F. Kharma, Mohammad Alkhanafseh, Ahmed Sabbah, David Mohaisen

The paper introduces the Mitigation-Aware Chain-of-Thought (MA-CoT) framework, which significantly enhances the security reliability of code generated by LLMs across multiple languages and models.

View →

cs.CRcs.CLRecentMay 31, 2026

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

Yunhao Feng, Xiaohu Du, Xinhao Deng, Yifan Ding +12 more

BraveGuard is a self-evolving defense framework that significantly improves the safety monitoring of computer-use agents by generating guard model supervision from open-world threat discovery and real…

View →

cs.CRcs.CLRecentMay 31, 2026

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

Yunhao Feng, Yifan Ding, Xiaohu Du, Ming Wen +12 more

BraveGuard is a self-evolving defense framework that improves the safety of computer-use agents by training guard models on open-world, multi-step threat trajectories rather than static benchmarks.

View →