Papers similar to 2604.17256v1

~ similar to 2604.17256v1· 20 results

cs.SEcs.CRRecentApr 22, 2026

A Ground-Truth-Based Evaluation of Vulnerability Detection Across Multiple Ecosystems

Peter Mandl, Paul Mandl, Martin Häusl, Maximilian Auch

The paper conducts an empirical evaluation of automated vulnerability detection tools across multiple software ecosystems using a curated ground-truth dataset derived from OSV, highlighting systematic…

View →

cs.CRRecentMay 30, 2026

NICE: A Framework for Declarative and Machine-Checkable Vulnerability Reproduction

Minh-Luân Nguyen, Olivier Levillain, Julien Malka, Stefano Zacchiroli +1 more

The paper introduces NICE, a declarative framework that uses NixOS to build and automatically validate reproducible environments for demonstrating software vulnerabilities (CVEs), thereby improving th…

View →

cs.CRcs.AIcs.CLRecentApr 2, 2026

RuleForge: Automated Generation and Validation for Web Vulnerability Detection at Scale

Ayush Garg, Sophia Hager, Jacob Montiel, Aditya Tiwari +4 more

RuleForge is an automated system that generates and validates detection rules for web vulnerabilities from structured CVE templates, significantly improving detection accuracy and reducing false posit…

View →

cs.CRcs.MARecentJun 3, 2026

SHIELDS: Automating OS Hardening with Iterative Multi-Agent Remediation

Andrew Hamara, Dwight Horne, Aldehir Rojas, Timothy Kurniawan +4 more

SHIELDS is a multi-agent system that uses LLMs to automate OS hardening by iteratively proposing and refining fixes based on real-time system feedback, achieving up to 73% remediation success.

View →

cs.CRRecentApr 23, 2026

MCP Pitfall Lab: Exposing Developer Pitfalls in MCP Tool Server Security under Multi-Vector Attacks

Run Hao, Zhuoran Tan

The paper introduces MCP Pitfall Lab, a comprehensive security testing framework that rigorously assesses and validates developer pitfalls in Model Context Protocol (MCP) tool servers under realistic…

View →

cs.CRcs.AIRecentMay 26, 2026

ChainCaps: Composition-Safe Tool-Using Agents via Monotonic Capability Attenuation

Xiaochong Jiang, Shiqi Yang, Ziwei Li, Lifei Liu +2 more

ChainCaps introduces a novel runtime capability budgeting system that prevents 'permission laundering' in complex tool-using agents, significantly reducing attack success rates while maintaining benig…

View →

cs.CRcs.DBRecentApr 8, 2026

VulGD: A LLM-Powered Dynamic Open-Access Vulnerability Graph Database

Luat Do, Jiao Yin, Jinli Cao, Hua Wang

VulGD is a dynamic, open-access graph database that aggregates cybersecurity data from multiple sources and uses LLM embeddings to improve vulnerability representation and risk assessment.

View →

cs.CRRecentMay 7, 2026

Beyond Collection: Measuring the Detection Efficacy of Modern Security Logging Standards

Ryan Holeman, John Hastings, Varghese Mathew Vaidyan

This paper systematically evaluates modern security logging standards (CIM, OCSF, ECS) using a novel framework to quantify their detection efficacy across diverse exploit scenarios, revealing critical…

View →

cs.CRcs.SERecentApr 23, 2026

CrossCommitVuln-Bench: A Dataset of Multi-Commit Python Vulnerabilities Invisible to Per-Commit Static Analysis

Arunabh Majumdar

The paper introduces CrossCommitVuln-Bench, a benchmark dataset demonstrating that many real-world Python vulnerabilities are introduced across multiple commits, making them invisible to standard per-…

View →

cs.SEcs.CRRecentMay 21, 2026

Finding Missing Input Validation in TEEs via LLM-Assisted Symbolic Execution

Chengyan Ma, Jieke Shi, Ruidong Han, Ye Liu +2 more

The paper introduces SymTEE, an LLM-assisted symbolic execution framework that detects missing input validation vulnerabilities in TEE applications without needing complex, real TEE setups.

View →

cs.CRcs.SERecentMar 19, 2026

Cross-Ecosystem Vulnerability Analysis for Python Applications

Georgios Alexopoulos, Nikolaos Alexopoulos, Thodoris Sotiropoulos, Charalambos Mitropoulos +2 more

The paper introduces a provenance-aware vulnerability analysis approach that accurately identifies cross-ecosystem vulnerabilities in Python applications by resolving vendored native libraries to spec…

View →

cs.SEcs.CRRecentMar 27, 2026

A Large-scale Empirical Study on the Generalizability of Disclosed Java Library Vulnerability Exploits

Zirui Chen, Qi Zhan, Jiayuan Zhou, Xing Hu +2 more

This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for e…

View →

cs.CRRecentApr 19, 2026

Original Sin of npm: A Study on Vulnerability Propagation in JavaScript Dependency Networks

Michael Robinson, Sajal Halder, Muhammad Ejaz Ahmed, Muhammad Ikram +2 more

The paper analyzes a large dataset of JavaScript packages to demonstrate that a small number of vulnerable dependencies can propagate vulnerabilities across a disproportionately large number of packag…

View →

cs.CRcs.SERecentApr 2, 2026

From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

Yiheng Huang, Zhijia Zhao, Bihuan Chen, Susheng Wu +4 more

This paper introduces a component-centric framework and a novel detector, Connor, to understand and detect sophisticated, multi-component attacks targeting the Model Context Protocol (MCP) servers.

View →

cs.CRRecentApr 3, 2026

Design and Implementation of an Open-Source Security Framework for Cloud Infrastructure

Wanru Shao

The paper introduces an open-source security framework that significantly improves cloud infrastructure security assessment by unifying identity and resource data, reducing false positives, and automa…

View →

cs.CRcs.AIRecentApr 2, 2026

Seclens: Role-specific Evaluation of LLM's for security vulnerablity detection

Subho Halder, Siddharth Saxena, Kashinath Kadaba Shrish, Thiyagarajan M

The paper introduces SecLens-R, a multi-stakeholder evaluation framework, demonstrating that LLM performance for vulnerability detection varies significantly depending on the specific priorities (e.g.…

View →

cs.CRcs.SCRecentMay 25, 2026

Heimdall: Formally Verified Automated Migration of Legacy eBPF Programs to Rust

Vishnu Asutosh Dasu, Monika Santra, Md Rafi Ur Rashid, Ashish Kumar +2 more

The paper introduces Heimdall, an automated pipeline that uses LLMs and formal verification to safely and automatically migrate legacy, potentially buggy eBPF programs written in C to memory-safe Rust…

View →

cs.CRcs.SERecentMay 4, 2026

A Validated Prompt Bank for Malicious Code Generation: Separating Executable Weapons from Security Knowledge in 1,554 Consensus-Labeled Prompts

Richard J. Young, Gregory D. Moody

The paper introduces a validated, consensus-labeled prompt bank that separates requests for executable malicious code (weapons) from requests for general harmful security knowledge, providing a more g…

View →

cs.CRcs.MARecentJun 4, 2026

ZERO-APT: A Closed-Loop Adversarial Framework for LLM-Driven Automated Penetration Testing under Intelligent Defense

Anlan Zheng, Tiantian Zhu

ZERO-APT introduces a novel closed-loop adversarial framework for automated penetration testing that simulates attacks against an intelligent, real-time defending system, achieving a high attack succe…

View →

cs.CRcs.AIRecentMay 28, 2026

How Reliable Are AI Attackers Against a Fixed Vulnerable Target? A 400-Run Empirical Study of LLM Penetration Testing Consistency

Galip Tolga Erdem

This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…

View →