~ similar to 2605.05531v1· 20 results
He Yang Yuan, Xin Wang, Kundi Yao, An Ran Chen +2 more
The paper characterizes logging code security issues and benchmarks LLMs, finding that while LLMs can moderately detect these issues, they struggle significantly with reliably generating correct code…
Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more
VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…
Leonardo Bitzki, Diego Kreutz, Tiago Heinrich, Douglas Fideles +3 more
NetSecBed is a container-native, scenario-oriented testbed designed to generate reproducible and auditable network traffic evidence and execution artifacts for complex cybersecurity research.
OpenSOC-AI is a lightweight framework that uses parameter-efficient fine-tuning of a small LLM to automate threat classification and severity assessment from raw security logs, significantly improving…
Pei-Yu Tseng, Lan Zhang, ZihDwo Yeh, Xiaoyan Sun +2 more
The paper introduces IOCRegex-gen, an automated LLM-based system that converts Indicators of Compromise (IOCs) into syntactically and semantically correct regular expressions, achieving high accuracy…
Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more
The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these honeypots provide substantially longer and harder-to-detect…
Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more
The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these systems provide substantially longer and harder-to-detect i…
Zirui Chen, Qi Zhan, Jiayuan Zhou, Xing Hu +2 more
This paper conducts a large-scale empirical study demonstrating that Java library exploits can accurately identify affected versions, achieving high recall and precision, and proposes strategies for e…
The paper proposes an end-to-end LLM framework that automates SOC operations by integrating ensemble-based threat detection, syntax-constrained query generation, and evidence-grounded incident resolut…
This paper provides the first longitudinal analysis of log-based detection rule evolution in public repositories, finding that rule changes reflect ongoing operational trade-offs rather than steady co…
The paper introduces a deterministic method to automatically synthesize initial SIEM detection rules (Sigma rules) from attack simulation findings, ensuring full traceability back to the specific orig…
Ayush Garg, Sophia Hager, Jacob Montiel, Aditya Tiwari +4 more
RuleForge is an automated system that generates and validates detection rules for web vulnerabilities from structured CVE templates, significantly improving detection accuracy and reducing false posit…
This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…
This study empirically measures the consistency and effectiveness of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation rates am…
This paper analyzes darknet traffic to characterize advanced, AI-assisted bot reconnaissance, finding that modern evasion techniques allow most bot traffic to bypass standard IDS thresholds.
Yiheng Huang, Zhijia Zhao, Bihuan Chen, Susheng Wu +4 more
This paper introduces a component-centric framework and a novel detector, Connor, to understand and detect sophisticated, multi-component attacks targeting the Model Context Protocol (MCP) servers.
The paper introduces False Security Confidence (FSC), a new metric to measure the inherent prevalence of security vulnerabilities in code generated by LLMs that are otherwise functionally correct, eve…
Oliver Jacobsen, Tobias Kirsch, Haya Schulmann, Niklas Vogel +1 more
This paper analyzes RPKI specifications, demonstrating that vague or conflicting requirements in dozens of RFCs cause systemic vulnerabilities in real-world implementations, leading to 61 undocumented…
The paper introduces 'log-substrate prompt injection,' demonstrating that attacker-controlled log fields can be used to manipulate LLM-powered security analysis, with persona hijacking and context man…
The paper introduces an end-to-end framework that not only detects network intrusions using deep learning but also generates actionable, citation-grounded mitigation reports using a Retrieval-Augmente…