~ similar to 2604.20211v1· 20 results
This paper systematically evaluates modern security logging standards (CIM, OCSF, ECS) using a novel framework to quantify their detection efficacy across diverse exploit scenarios, revealing critical…
The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…
The paper introduces Sieve, a system that uses a large language model (LLM) to generate executable query code from natural language security questions, significantly improving the ability to perform c…
Fariha Tanjim Shifat, Hariswar Baburaj, Ce Zhou, Jaydeb Sarker +1 more
The paper analyzes GitHub security advisories for LLM-integrated open-source systems, finding that while most vulnerabilities map to existing code-level weaknesses, the architectural risks like Supply…
Houjun Liu, Lisa Einstein, John Yang, Joachim Baumann +4 more
SecureForge is an automated pipeline that significantly reduces cybersecurity vulnerabilities in LLM-generated code by optimizing system prompts, achieving up to a 48% reduction in output vulnerabilit…
The paper demonstrates that relying on strict regular-expression parsing for evaluating LLM-based security log classifiers introduces systematic errors, potentially causing a functional model to appea…
The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…
The paper introduces a systematic benchmark to test LLMs' ability to recover Indicators of Compromise (IoCs) from JavaScript code, finding that while LLMs handle simple obfuscation well, encryption-ba…
The paper introduces a novel, large-scale dataset of vulnerable code snippets linked to CAPEC and CWE, generated using advanced LLMs, to improve automatic vulnerability detection.
The paper introduces codebadger, a Model Context Protocol (MCP) server that integrates Joern's Code Property Graph (CPG) with LLMs, enabling large language models to perform large-scale, semantic prog…
Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan +14 more
The paper introduces RAVEN, a Retrieval-Augmented Vulnerability Exploration Network, which uses LLM agents and RAG to automatically generate comprehensive, structured vulnerability analysis reports fo…
The paper introduces False Security Confidence (FSC), a new metric to measure the inherent prevalence of security vulnerabilities in code generated by LLMs that are otherwise functionally correct, eve…
Yifei Ge, Zhenpeng Chen, Weisong Sun, Yuchen Chen +6 more
The paper proposes a novel test-driven pipeline that simulates realistic code generation scenarios to detect privacy leaks in LLMs, achieving a 2.56x increase in detected leakage compared to existing…
This study conducts a large-scale longitudinal analysis of CodeQL, finding that while the tool is effective at detecting vulnerabilities, its detection capabilities are not guaranteed to be stable acr…
Yujie Ma, Jialin Rong, Chenxi Yang, Lili Quan +3 more
The paper addresses the gap in understanding real-world LLM-in-the-loop vulnerabilities by creating the LLMCVE dataset and demonstrating that these vulnerabilities are significantly harder to repair t…
The paper introduces LogJack, a benchmark demonstrating that LLM debugging agents consuming cloud logs are highly vulnerable to indirect prompt injection, with some models executing malicious commands…
Samuel Ndichu, Tao Ban, Seiichi Ozawa, Takeshi Takahashi +1 more
NLLog introduces a lightweight system that converts structured security logs into natural language sentences for improved anomaly detection, achieving high performance with low false-positive rates su…
Samuel Ndichu, Tao Ban, Seiichi Ozawa, Takeshi Takahashi +1 more
NLLog is a lightweight pipeline that rewrites system-generated logs into natural language for improved analysis and comprehension.
The paper introduces TLSCheck 2.0, an enhanced memory forensics plugin for Volatility 3, designed to efficiently detect and analyze suspicious TLS callbacks in process memory.
Chengyan Ma, Jieke Shi, Ruidong Han, Ye Liu +2 more
The paper introduces SymTEE, an LLM-assisted symbolic execution framework that detects missing input validation vulnerabilities in TEE applications without needing complex, real TEE setups.