ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2605.13138v1· 20 results

cs.CRcs.SERecentApr 23, 2026

CrossCommitVuln-Bench: A Dataset of Multi-Commit Python Vulnerabilities Invisible to Per-Commit Static Analysis

Arunabh Majumdar

The paper introduces CrossCommitVuln-Bench, a benchmark dataset demonstrating that many real-world Python vulnerabilities are introduced across multiple commits, making them invisible to standard per-…

View →
cs.SEcs.CRRecentMar 18, 2026

Revisiting Vulnerability Patch Identification on Data in the Wild

Ivana Clairine Irsan, Ratnadira Widyasari, Ting Zhang, Huihui Huang +6 more

The paper demonstrates that security patch detection models trained solely on publicly reported vulnerabilities (NVD) perform poorly when tested on real-world, unreported 'in-the-wild' patches, sugges…

View →
cs.CRcs.SERecentApr 27, 2026

MAS-SZZ: Multi-Agentic SZZ Algorithm for Vulnerability-Inducing Commit Identification

Sicong Cao, Jinxuan Xu, Le Yu, Jing Yang +3 more

The paper proposes MAS-SZZ, a multi-agentic algorithm that significantly improves the identification of the earliest commit introducing a software vulnerability by combining root cause analysis with s…

View →
cs.CRRecentMay 8, 2026

Longitudinal Analyses of SAST Tools: A CodeQL Case Study

Jean-Charles Noirot Ferrand, Kyle Domico, Yohan Beugin, Patrick McDaniel

This study conducts a large-scale longitudinal analysis of CodeQL, finding that while the tool is effective at detecting vulnerabilities, its detection capabilities are not guaranteed to be stable acr…

View →
cs.CRRecentMar 30, 2026

VulnScout-C: A Lightweight Transformer for C Code Vulnerability Detection

Aymen Lassoued, Nacef Mbarek, Bechir Dardouri, Bassem Ouni +2 more

The paper introduces VULNSCOUT-C, a compact, specialized transformer model that achieves state-of-the-art performance in C code vulnerability detection while maintaining low inference cost, making it…

View →
cs.SEcs.CRRecentApr 22, 2026

Residual Risk Analysis in Benign Code: How Far Are We? A Multi-Model Semantic and Structural Similarity Approach

Mohammad Farhad, Shuvalaxmi Dass

The paper proposes a Residual Risk Scoring (RRS) framework that uses combined semantic and structural similarity analysis to estimate potential residual security risks in code after patching, finding…

View →
cs.CRcs.SERecentMar 31, 2026

When Labels Are Scarce: A Systematic Mapping of Label-Efficient Code Vulnerability Detection

Noor Khalal, Chakib Fettal, Lazhar Labiod, Mohamed Nadif

This systematic mapping survey reviews label-efficient approaches for code vulnerability detection, synthesizing five paradigm families and providing a decision guide to navigate trade-offs.

View →
cs.CRcs.AIRecentApr 2, 2026

From Theory to Practice: Code Generation Using LLMs for CAPEC and CWE Frameworks

Murtuza Shahzad, Joseph Wilson, Ibrahim Al Azher, Hamed Alhoori +1 more

The paper introduces a novel, large-scale dataset of vulnerable code snippets linked to CAPEC and CWE, generated using advanced LLMs, to improve automatic vulnerability detection.

View →
cs.CRcs.AIcs.MARecentApr 20, 2026

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs

Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more

The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…

View →
cs.SEcs.CRRecentMay 27, 2026

Towards Demystifying and Repairing LLM-in-the-Loop Vulnerabilities

Yujie Ma, Jialin Rong, Chenxi Yang, Lili Quan +3 more

The paper addresses the gap in understanding real-world LLM-in-the-loop vulnerabilities by creating the LLMCVE dataset and demonstrating that these vulnerabilities are significantly harder to repair t…

View →
cs.CRRecentMar 25, 2026

Bridging Code Property Graphs and Language Models for Program Analysis

Ahmed Lekssays

The paper introduces codebadger, a Model Context Protocol (MCP) server that integrates Joern's Code Property Graph (CPG) with LLMs, enabling large language models to perform large-scale, semantic prog…

View →
cs.CRcs.AIRecentMay 7, 2026

LCC-LLM: Leveraging Code-Centric Large Language Models for Malware Attribution

Christopher G. Pedraza Pohlenz, Hassan Jalil Hadi, Ali Hassan, Ali Shoker

The paper introduces LCC-LLM, a code-centric framework and dataset that significantly improves the reliability of malware attribution and static analysis by grounding LLM reasoning in comprehensive, m…

View →
cs.SEcs.CRRecentMar 27, 2026

A Large-scale Empirical Study on the Generalizability of Disclosed Java Library Vulnerability Exploits

Zirui Chen, Qi Zhan, Jiayuan Zhou, Xing Hu +2 more

This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for e…

View →
cs.CRRecentMay 19, 2026

Hunting Vulnerability Variants in AI Infra: Measurement and Reference-Driven Detection

Tian Dong, Yanjun Chen, Shoufeng Zhang, Huaien Zhang +5 more

This paper measures the prevalence of recurring vulnerability patterns (variants) across multiple AI infrastructure repositories and proposes INFRASCOPE, a framework to automatically detect these vari…

View →
cs.SEcs.AIcs.CRRecentApr 12, 2026

Verify Before You Fix: Agentic Execution Grounding for Trustworthy Cross-Language Code Analysis

Jugal Gajjar

The paper introduces an execution-grounded, cross-language framework that significantly improves the reliability of LLM-driven code vulnerability analysis by ensuring that all proposed fixes are confi…

View →
cs.CRcs.AIRecentApr 14, 2026

LogicEval: A Systematic Framework for Evaluating Automated Repair Techniques for Logical Vulnerabilities in Real-World Software

Syed Md Mukit Rashid, Abdullah Al Ishtiaq, Kai Tu, Yilu Dong +6 more

The paper introduces LogicEval, a systematic framework and dataset (LogicDS) to evaluate automated repair techniques for logical software vulnerabilities, finding that prompt sensitivity and context l…

View →
cs.CRcs.LGRecentMay 26, 2026

SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?

Hwiwon Lee, Jiawei Liu, Dongjun Kim, Ziqi Zhang +2 more

The paper introduces SEC-bench Pro, a rigorous benchmark for evaluating LLM-based bug hunting on complex software, finding that even advanced agents struggle with long-horizon security tasks.

View →
cs.CRcs.LGRecentApr 29, 2026

VulStyle: A Multi-Modal Pre-Training for Code Stylometry-Augmented Vulnerability Detection

Chidera Biringa, Ajmal Abbas, Vishnu Selvaraj, Gokhan Kul

VulStyle introduces a multi-modal model that jointly encodes source code, non-terminal AST structure, and code stylometry features to achieve state-of-the-art performance in software vulnerability det…

View →
cs.CRcs.SERecentMay 3, 2026

VulKey: Automated Vulnerability Repair Guided by Domain-Specific Repair Patterns

Jia Li, Zhuangbin Chen, Yuxin Su, Michael R. Lyu

VulKey introduces a novel LLM-based framework that uses a hierarchical abstraction of expert security knowledge to guide automatic vulnerability repair, achieving state-of-the-art performance on real-…

View →
cs.CRcs.AIcs.SERecentMay 5, 2026

MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents

Jonathan Steinberg, Oren Gal

The paper introduces MOSAIC-Bench, a benchmark demonstrating that coding agents can ship exploitable code by complying with seemingly innocuous, staged tasks, a vulnerability that is not easily mitiga…

View →