~ similar to 2604.21679v1· 20 results
Analyzing Reddit discussions, the paper finds that while security practitioners see LLMs as useful for boosting productivity, their adoption is constrained by concerns over reliability, verification,…
The paper proposes an end-to-end LLM framework that automates SOC operations by integrating ensemble-based threat detection, syntax-constrained query generation, and evidence-grounded incident resolut…
The paper proposes an organization-scoped LLM agent runtime architecture designed to provide an auditable, model-agnostic platform for regulated cybersecurity operations, integrating deeply with exist…
The paper proposes a novel, organization-scoped LLM agent runtime architecture designed specifically for regulated cybersecurity operations, ensuring auditable context and integration with existing se…
The study compared the cybersecurity risk assessment capabilities of five popular large language models (LLMs) against human experts, finding that LLMs consistently underestimated risks and require ma…
The paper argues that LLM agent security is fundamentally an agent-human interaction (AHI) problem, demonstrating that industry practices rely on human-centric mechanisms while academic research focus…
Minfeng Qi, Tianqing Zhu, Zijie Xu, Congcong Zhu +2 more
The paper introduces CAESAR, a novel multi-agent framework that coordinates LLM agents across five specialized roles to improve success rates and stability in complex, multi-stage cyber intrusion task…
The paper introduces LITE-SOC, a lightweight, web-based simulator designed to provide a practical, accessible alternative for teaching cybersecurity SOC workflows without requiring complex, expensive…
This pilot study investigates SME readiness for Zero Trust Architecture (ZTA) and proposes a realistic three-stage adoption path based on survey data from IT professionals.
SOCpilot is a system that verifies the compliance of LLM-drafted incident response plans against mandatory policies and required procedural steps, significantly improving the reliability of AI-assiste…
The paper empirically evaluates domain-adapted and general-purpose LLMs for structured threat modelling (STRIDE on 5G security), finding that domain adaptation and model size do not guarantee reliable…
Muhammad Bilal, Jon Crowcroft, Ruizhi Wang, Xiaolong Xu +1 more
The paper surveys the use of LLMs for agentic NetOps and AIOps, arguing that operational reliability depends not on the model itself, but on robust surrounding machinery and workflow-centered evaluati…
Rishikesh Sahay, Bell Eapen, Weizhi Meng, Md Rasel Al Mamun +4 more
The paper proposes an automated, LLM-enabled threat hunting framework integrated with Splunk to help SOC analysts autonomously monitor evolving threats and prioritize suspicious network traffic.
The paper introduces ASTRAL, a multimodal LLM-driven framework that reconstructs and analyzes fragmented cyber-physical system architectures to enable comprehensive and quantitative security risk asse…
The paper introduces CyberCertBench, a new benchmark suite for evaluating LLMs against industry cybersecurity certifications, finding that while frontier models perform well on general knowledge, thei…
The paper proposes an LLM-enhanced methodology using RAG to automate the creation of security profiles, ensuring compliance with Ukrainian cybersecurity regulations and international best practices.
Jiaren Peng, Zeqin Li, Chang You, Yan Wang +16 more
This paper provides the first comprehensive systematization and large-scale empirical evaluation of existing LLM-based Automated Penetration Testing (AutoPT) frameworks, offering a structured taxonomy…
The paper proposes a Semantic Gateway and a Zero-Trust security model to formally validate and secure autonomous AI agents operating in enterprise systems, achieving a 100% discovery rate of unauthori…
Ravish Gupta, Saket Kumar, Shreeya Sharma, Maulik Dang +1 more
The paper introduces a novel six-agent AI architecture for cybersecurity risk assessment, demonstrating high accuracy and speed compared to human experts, though its performance is ultimately limited…
The paper introduces the CAI Dataset, a massive, multi-terabyte corpus of real-world, hands-on cybersecurity LLM trajectories, designed to address the performance bottleneck caused by expert operator…