Papers similar to 2604.20211v1

Similar to 2604.20211v1 · 20 results

cs.CR · Recent · May 7, 2026

Beyond Collection: Measuring the Detection Efficacy of Modern Security Logging Standards

Ryan Holeman, John Hastings, Varghese Mathew Vaidyan

This paper systematically evaluates modern security logging standards (CIM, OCSF, ECS) using a novel framework to quantify their detection efficacy across diverse exploit scenarios, revealing critical…

cs.CR, cs.AI, cs.LG · Recent · May 22, 2026

An Empirical Evaluation of LLM-Generated Code Security Across Prompting Methods

Mohammed Kharma, Ahmed Sabbah, Mohammad Alkhanafseh, Mohammad Hammoudeh +1 more

The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…

cs.CR · Recent · May 21, 2026

Parser-Free Querying of Security Logs

Evan Luo, Julien Piet, David Wagner

The paper introduces Sieve, a system that uses a large language model (LLM) to generate executable query code from natural language security questions, significantly improving the ability to perform c…

cs.CR, cs.SE · Recent · Apr 5, 2026

LLM-Enabled Open-Source Systems in the Wild: An Empirical Study of Vulnerabilities in GitHub Security Advisories

Fariha Tanjim Shifat, Hariswar Baburaj, Ce Zhou, Jaydeb Sarker +1 more

The paper analyzes GitHub security advisories for LLM-integrated open-source systems, finding that while most vulnerabilities map to existing code-level weaknesses, the architectural risks like Supply…

cs.CR, cs.CL, cs.CY · Recent · May 8, 2026

SecureForge: Finding and Preventing Vulnerabilities in LLM-Generated Code via Prompt Optimization

Houjun Liu, Lisa Einstein, John Yang, Joachim Baumann +4 more

SecureForge is an automated pipeline that significantly reduces cybersecurity vulnerabilities in LLM-generated code by optimizing system prompts, achieving up to a 48% reduction in output vulnerabilit…

cs.CR · Recent · May 8, 2026

When the Ruler is Broken: Parsing-Induced Suppression in LLM-Based Security Log Evaluation

Chaitanya Vilas Garware, Sharif Noor Zisad

The paper demonstrates that relying on strict regular-expression parsing for evaluating LLM-based security log classifiers introduces systematic errors, potentially causing a functional model to appea…

cs.CR, cs.AI · Recent · Apr 1, 2026

Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks

Anubhab Sahu, Diptisha Samanta, Reza Soosahabi

The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…

cs.CR · Recent · May 7, 2026

Benchmarking Large Language Models for IoC Recovery under Adversarial Code Obfuscation and Encryption

Jaime Morales, Sergio Pastrana, Juan Tapiador

The paper introduces a systematic benchmark to test LLMs' ability to recover Indicators of Compromise (IoCs) from JavaScript code, finding that while LLMs handle simple obfuscation well, encryption-ba…

cs.CR, cs.AI · Recent · Apr 2, 2026

From Theory to Practice: Code Generation Using LLMs for CAPEC and CWE Frameworks

Murtuza Shahzad, Joseph Wilson, Ibrahim Al Azher, Hamed Alhoori +1 more

The paper introduces a novel, large-scale dataset of vulnerable code snippets linked to CAPEC and CWE, generated using advanced LLMs, to improve automatic vulnerability detection.

cs.CR · Recent · Mar 25, 2026

Bridging Code Property Graphs and Language Models for Program Analysis

Ahmed Lekssays

The paper introduces codebadger, a Model Context Protocol (MCP) server that integrates Joern's Code Property Graph (CPG) with LLMs, enabling large language models to perform large-scale, semantic prog…

cs.CR, cs.AI, cs.MA · Recent · Apr 20, 2026

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs

Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more

The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…

cs.CR · Recent · Apr 18, 2026

False Security Confidence in Benign LLM Code Generation

Xiaolei Ren

The paper introduces False Security Confidence (FSC), a new metric to measure the inherent prevalence of security vulnerabilities in code generated by LLMs that are otherwise functionally correct, eve…

cs.SE, cs.CR · Recent · May 14, 2026

Probing Privacy Leaks in LLM-based Code Generation via Test Generation

Yifei Ge, Zhenpeng Chen, Weisong Sun, Yuchen Chen +6 more

The paper proposes a novel test-driven pipeline that simulates realistic code generation scenarios to detect privacy leaks in LLMs, achieving a 2.56x increase in detected leakage compared to existing…

cs.CR · Recent · May 8, 2026

Longitudinal Analyses of SAST Tools: A CodeQL Case Study

Jean-Charles Noirot Ferrand, Kyle Domico, Yohan Beugin, Patrick McDaniel

This study conducts a large-scale longitudinal analysis of CodeQL, finding that while the tool is effective at detecting vulnerabilities, its detection capabilities are not guaranteed to be stable acr…

cs.SE, cs.CR · Recent · May 27, 2026

Towards Demystifying and Repairing LLM-in-the-Loop Vulnerabilities

Yujie Ma, Jialin Rong, Chenxi Yang, Lili Quan +3 more

The paper addresses the gap in understanding real-world LLM-in-the-loop vulnerabilities by creating the LLMCVE dataset and demonstrating that these vulnerabilities are significantly harder to repair t…

cs.CR · Recent · Apr 15, 2026

LogJack: Indirect Prompt Injection Through Cloud Logs Against LLM Debugging Agents

Harsh Shah

The paper introduces LogJack, a benchmark demonstrating that LLM debugging agents consuming cloud logs are highly vulnerable to indirect prompt injection, with some models executing malicious commands…

cs.CR, cs.IR, cs.LG · Recent · Jun 3, 2026

NLLog: Lightweight, Explainable SOC Anomaly Detection via Log-to-Language Rewriting

Samuel Ndichu, Tao Ban, Seiichi Ozawa, Takeshi Takahashi +1 more

NLLog introduces a lightweight system that converts structured security logs into natural language sentences for improved anomaly detection, achieving high performance with low false-positive rates su…

cs.CR, cs.IR, cs.LG · Recent · Jun 3, 2026

NLLog: Lightweight, Explainable SOC Anomaly Detection via Log-to-Language Rewriting

Samuel Ndichu, Tao Ban, Seiichi Ozawa, Takeshi Takahashi +1 more

NLLog is a lightweight pipeline that rewrites system-generated logs into natural language for improved analysis and comprehension.

cs.CR · Recent · Apr 22, 2026

TLSCheck 2.0: An Enhanced Memory Forensics Approach to Efficiently Detect TLS Callbacks

Kartik N. Iyer, Parag H. Rughani

The paper introduces TLSCheck 2.0, an enhanced memory forensics plugin for Volatility 3, designed to efficiently detect and analyze suspicious TLS callbacks in process memory.

cs.SE, cs.CR · Recent · May 21, 2026

Finding Missing Input Validation in TEEs via LLM-Assisted Symbolic Execution

Chengyan Ma, Jieke Shi, Ruidong Han, Ye Liu +2 more

The paper introduces SymTEE, an LLM-assisted symbolic execution framework that detects missing input validation vulnerabilities in TEE applications without needing complex, real TEE setups.
