Papers similar to 2603.18693v2

~ similar to 2603.18693v2· 20 results

cs.SEcs.CRRecentApr 22, 2026

A Ground-Truth-Based Evaluation of Vulnerability Detection Across Multiple Ecosystems

Peter Mandl, Paul Mandl, Martin Häusl, Maximilian Auch

The paper conducts an empirical evaluation of automated vulnerability detection tools across multiple software ecosystems using a curated ground-truth dataset derived from OSV, highlighting systematic…

View →

cs.SEcs.CRRecentMar 27, 2026

A Large-scale Empirical Study on the Generalizability of Disclosed Java Library Vulnerability Exploits

Zirui Chen, Qi Zhan, Jiayuan Zhou, Xing Hu +2 more

This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for e…

View →

cs.CRRecentApr 19, 2026

Original Sin of npm: A Study on Vulnerability Propagation in JavaScript Dependency Networks

Michael Robinson, Sajal Halder, Muhammad Ejaz Ahmed, Muhammad Ikram +2 more

The paper analyzes a large dataset of JavaScript packages to demonstrate that a small number of vulnerable dependencies can propagate vulnerabilities across a disproportionately large number of packag…

View →

cs.CRcs.SERecentApr 23, 2026

CrossCommitVuln-Bench: A Dataset of Multi-Commit Python Vulnerabilities Invisible to Per-Commit Static Analysis

Arunabh Majumdar

The paper introduces CrossCommitVuln-Bench, a benchmark dataset demonstrating that many real-world Python vulnerabilities are introduced across multiple commits, making them invisible to standard per-…

View →

cs.CRRecentJun 1, 2026

PyFEX: Uncovering Evasive Python-based Threats via Resilient and Exhaustive Path Exploration

Meng Wang, Yue Ma, Majid Garoosi, Wenting Fan +3 more

PyFEX introduces a resilient forced-execution engine to exhaustively analyze Python code, successfully detecting previously unknown malicious packages and binaries in the Python ecosystem.

View →

cs.CRRecentMay 20, 2026

VIPER-MCP: Detecting and Exploiting Taint-Style Vulnerabilities in Model Context Protocol Servers

Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more

VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…

View →

cs.CRcs.SERecentApr 5, 2026

LLM-Enabled Open-Source Systems in the Wild: An Empirical Study of Vulnerabilities in GitHub Security Advisories

Fariha Tanjim Shifat, Hariswar Baburaj, Ce Zhou, Jaydeb Sarker +1 more

The paper analyzes GitHub security advisories for LLM-integrated open-source systems, finding that while most vulnerabilities map to existing code-level weaknesses, the architectural risks like Supply…

View →

cs.CRRecentMay 8, 2026

Longitudinal Analyses of SAST Tools: A CodeQL Case Study

Jean-Charles Noirot Ferrand, Kyle Domico, Yohan Beugin, Patrick McDaniel

This study conducts a large-scale longitudinal analysis of CodeQL, finding that while the tool is effective at detecting vulnerabilities, its detection capabilities are not guaranteed to be stable acr…

View →

cs.SEcs.CRRecentApr 9, 2026

Vulnerability Detection with Interprocedural Context in Multiple Languages: Assessing Effectiveness and Cost of Modern LLMs

Kevin Lira, Baldoino Fonseca, Davy Baía, Márcio Ribeiro +1 more

This study assesses the effectiveness and cost of four modern LLMs in detecting vulnerabilities that span multiple functions (interprocedural dependencies), finding that Gemini 3 Flash offers strong c…

View →

cs.CRcs.AIcs.MARecentApr 20, 2026

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs

Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more

The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…

View →

cs.CRcs.AIcs.SERecentMay 5, 2026

Cryptographic Registry Provenance: Structural Defense Against Dependency Confusion in AI Package Ecosystems

Alan L. McCann

The paper proposes a comprehensive cryptographic distribution provenance system to structurally defend against dependency confusion attacks in software package ecosystems.

View →

cs.CRcs.AIcs.SERecentApr 22, 2026

Taint-Style Vulnerability Detection and Confirmation for Node.js Packages Using LLM Agent Reasoning

Ronghao Ni, Mihai Christodorescu, Limin Jia

The paper introduces LLMVD.js, a multi-stage LLM agent pipeline that effectively detects and confirms taint-style vulnerabilities in Node.js packages, achieving significantly higher confirmation rates…

View →

cs.CRRecentMay 19, 2026

Hunting Vulnerability Variants in AI Infra: Measurement and Reference-Driven Detection

Tian Dong, Yanjun Chen, Shoufeng Zhang, Huaien Zhang +5 more

This paper measures the prevalence of recurring vulnerability patterns (variants) across multiple AI infrastructure repositories and proposes INFRASCOPE, a framework to automatically detect these vari…

View →

cs.CRcs.AIcs.LGRecentMay 23, 2026

Demystifying the Mythos or Disrupting Bugonomics? From Zero-Day Asymmetry to Defender Remediation Throughput

Alfredo Pesoli, Herman Errico, Lorenzo Cavallaro

The paper argues that the near-term impact of LLM-assisted vulnerability discovery is not simply an increase in zero-day volume, but a critical bottleneck in defender remediation throughput, shifting…

View →

cs.CRcs.DBRecentApr 8, 2026

VulGD: A LLM-Powered Dynamic Open-Access Vulnerability Graph Database

Luat Do, Jiao Yin, Jinli Cao, Hua Wang

VulGD is a dynamic, open-access graph database that aggregates cybersecurity data from multiple sources and uses LLM embeddings to improve vulnerability representation and risk assessment.

View →

cs.CRRecentMar 25, 2026

Bridging Code Property Graphs and Language Models for Program Analysis

Ahmed Lekssays

The paper introduces codebadger, a Model Context Protocol (MCP) server that integrates Joern's Code Property Graph (CPG) with LLMs, enabling large language models to perform large-scale, semantic prog…

View →

cs.CRcs.LGcs.SERecentMay 16, 2026

The Range Shrinks, the Threat Remains: Re-evaluating LLM Package Hallucinations on the 2026 Frontier-Model Cohort

Aleksandr Churilov

This study re-evaluates LLM package hallucination rates on a new cohort of frontier models, finding a significant reduction in overall hallucination rates but identifying a persistent, model-agnostic…

View →

cs.SEcs.CRRecentMay 27, 2026

Towards Demystifying and Repairing LLM-in-the-Loop Vulnerabilities

Yujie Ma, Jialin Rong, Chenxi Yang, Lili Quan +3 more

The paper addresses the gap in understanding real-world LLM-in-the-loop vulnerabilities by creating the LLMCVE dataset and demonstrating that these vulnerabilities are significantly harder to repair t…

View →

cs.SEcs.CRRecentMar 18, 2026

Revisiting Vulnerability Patch Identification on Data in the Wild

Ivana Clairine Irsan, Ratnadira Widyasari, Ting Zhang, Huihui Huang +6 more

The paper demonstrates that security patch detection models trained solely on publicly reported vulnerabilities (NVD) perform poorly when tested on real-world, unreported 'in-the-wild' patches, sugges…

View →

cs.CRRecentMay 30, 2026

NICE: A Framework for Declarative and Machine-Checkable Vulnerability Reproduction

Minh-Luân Nguyen, Olivier Levillain, Julien Malka, Stefano Zacchiroli +1 more

The paper introduces NICE, a declarative framework that uses NixOS to build and automatically validate reproducible environments for demonstrating software vulnerabilities (CVEs), thereby improving th…

View →