~ similar to 2604.06274v1· 20 results
The paper proposes an organization-scoped LLM agent runtime architecture designed to provide an auditable, model-agnostic platform for regulated cybersecurity operations, integrating deeply with exist…
The paper proposes a novel, organization-scoped LLM agent runtime architecture designed specifically for regulated cybersecurity operations, ensuring auditable context and integration with existing se…
Analyzing Reddit discussions, the paper finds that while security practitioners see LLMs as useful for boosting productivity, their adoption is constrained by concerns over reliability, verification,…
The paper introduces PROPARAG, an automated framework that autonomously assesses how well organizational cybersecurity policies comply with standard security controls, achieving high F1 scores on real…
The study compared the cybersecurity risk assessment capabilities of five popular large language models (LLMs) against human experts, finding that LLMs consistently underestimated risks and require ma…
The paper proposes an end-to-end LLM framework that automates SOC operations by integrating ensemble-based threat detection, syntax-constrained query generation, and evidence-grounded incident resolut…
The paper introduces CyberCertBench, a new benchmark suite for evaluating LLMs against industry cybersecurity certifications, finding that while frontier models perform well on general knowledge, thei…
The paper introduces the CAI Dataset, a massive, multi-terabyte corpus of real-world, hands-on cybersecurity LLM trajectories, designed to address the performance bottleneck caused by expert operator…
Ayush Garg, Sophia Hager, Jacob Montiel, Aditya Tiwari +4 more
RuleForge is an automated system that generates and validates detection rules for web vulnerabilities from structured CVE templates, significantly improving detection accuracy and reducing false posit…
The paper demonstrates that adopting LLM-based tools in cybersecurity operations requires a sociotechnical, practitioner-centered co-creation approach, which successfully overcame historical adoption…
CyBOKClaw is an interpretable human-in-the-loop retrieval framework designed to map broad cybersecurity keywords to the Cyber Security Body of Knowledge (CyBOK), achieving high expert-guided mapping a…
Safayat Bin Hakim, Aniqa Afzal, Qi Zhao, Vigna Majmundar +2 more
CyberCane is a neuro-symbolic framework that enhances phishing detection by combining symbolic rule analysis with privacy-preserving RAG and formal ontology reasoning, achieving high recall against AI…
The paper proposes CyberAId, a hybrid multi-agent system designed to enhance cybersecurity for financial institutions by integrating specialized LLM subagents with existing SIEM/XDR telemetry, address…
Qian'ang Mao, Jiaxin Wang, Ya Liu, Li Zhu +2 more
The paper develops a unified, cross-layer security framework for autonomous LLM agents operating in agentic commerce, identifying key attack vectors and proposing a layered defense architecture.
The paper introduces CritBench, a novel framework to evaluate LLM cybersecurity capabilities specifically within IEC 61850 Digital Substation Operational Technology (OT) environments, finding that whi…
The paper introduces ASTRAL, a multimodal LLM-driven framework that reconstructs and analyzes fragmented cyber-physical system architectures to enable comprehensive and quantitative security risk asse…
SOCpilot is a system that verifies the compliance of LLM-drafted incident response plans against mandatory policies and required procedural steps, significantly improving the reliability of AI-assiste…
This paper provides a systematic regulatory mapping and compliance architecture for AI agents operating under the complex web of EU laws, concluding that high-risk agents with untraceable behavioral d…
Rishikesh Sahay, Bell Eapen, Weizhi Meng, Md Rasel Al Mamun +4 more
The paper proposes an automated, LLM-enabled threat hunting framework integrated with Splunk to help SOC analysts autonomously monitor evolving threats and prioritize suspicious network traffic.
This paper introduces UPAttack, a novel threat model demonstrating that focusing on explicit usability requirements can cause LLMs to generate insecure code by neglecting implicit security constraints…