~ similar to 2604.01977v1· 20 results
The paper introduces a deterministic method to automatically synthesize initial SIEM detection rules (Sigma rules) from attack simulation findings, ensuring full traceability back to the specific orig…
This paper provides the first longitudinal analysis of log-based detection rule evolution in public repositories, finding that rule changes reflect ongoing operational trade-offs rather than steady co…
This paper systematically surveys adaptive and AI-augmented security testing, concluding that a major gap exists—structural-adaptive fragmentation—where current systems fail to integrate structural pr…
Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more
VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…
The paper introduces NICE, a declarative framework that uses NixOS to build and automatically validate reproducible environments for demonstrating software vulnerabilities (CVEs), thereby improving th…
Pei-Yu Tseng, Lan Zhang, ZihDwo Yeh, Xiaoyan Sun +2 more
The paper introduces IOCRegex-gen, an automated LLM-based system that converts Indicators of Compromise (IOCs) into syntactically and semantically correct regular expressions, achieving high accuracy…
The paper establishes a standardized security assessment framework and develops a multi-layered defensive system, demonstrating that systematic testing and external defenses are crucial for safe LLM d…
Ming Xu, Hongtai Wang, Yanpei Guo, Zhengmin Yu +4 more
ARuleCon is an agentic framework that autonomously and accurately converts security rules across heterogeneous SIEM platforms, significantly outperforming baseline LLMs in fidelity.
Ting Zhang, Yikun Li, Chengran Yang, Ratnadira Widyasari +14 more
TitanCA presents a novel, multi-agent LLM orchestration framework that significantly improves vulnerability discovery by reducing false positives and identifying numerous zero-day vulnerabilities.
The paper introduces the CAI Dataset, a massive, multi-terabyte corpus of real-world, hands-on cybersecurity LLM trajectories, designed to address the performance bottleneck caused by expert operator…
The paper introduces a novel, large-scale dataset of vulnerable code snippets linked to CAPEC and CWE, generated using advanced LLMs, to improve automatic vulnerability detection.
Houjun Liu, Lisa Einstein, John Yang, Joachim Baumann +4 more
SecureForge is an automated pipeline that significantly reduces cybersecurity vulnerabilities in LLM-generated code by optimizing system prompts, achieving up to a 48% reduction in output vulnerabilit…
FORGE is a multi-agent system that integrates vulnerability exploitation, prioritization, and detection engineering into a single pipeline, achieving high-fidelity, multi-level exploitation and genera…
The paper introduces False Security Confidence (FSC), a new metric to measure the inherent prevalence of security vulnerabilities in code generated by LLMs that are otherwise functionally correct, eve…
Shenao Wang, Xinyi Hou, Zhao Liu, Yanjie Zhao +4 more
This paper introduces Agentic Workflow Injection (AWI), a new class of vulnerability in LLM-powered GitHub Actions, and presents TaintAWI, a novel taint-analysis tool that identifies hundreds of explo…
The paper benchmarks current frontier computer-using agents against hand-crafted attacks, finding that while they are highly safe in browser tasks, this safety does not generalize to other domains lik…
The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…
Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more
The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these honeypots provide substantially longer and harder-to-detect…
Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more
The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these systems provide substantially longer and harder-to-detect i…
The paper proposes an end-to-end LLM framework that automates SOC operations by integrating ensemble-based threat detection, syntax-constrained query generation, and evidence-grounded incident resolut…