ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2604.26217v1· 20 results

cs.CRcs.AIcs.IRRecentApr 30, 2026

Toward Autonomous SOC Operations: End-to-End LLM Framework for Threat Detection, Query Generation, and Resolution in Security Operations

Md Hasan Saju, Akramul Azim

The paper proposes an end-to-end LLM framework that automates SOC operations by integrating ensemble-based threat detection, syntax-constrained query generation, and evidence-grounded incident resolut…

View →
cs.CRcs.AIRecentMay 11, 2026

Threat Modelling using Domain-Adapted Language Models: Empirical Evaluation and Insights

Saba Pourhanifeh, AbdulAziz AbdulGhaffar, Ashraf Matrawy

The paper empirically evaluates domain-adapted and general-purpose LLMs for structured threat modelling (STRIDE on 5G security), finding that domain adaptation and model size do not guarantee reliable…

View →
cs.CRRecentMay 8, 2026

When the Ruler is Broken: Parsing-Induced Suppression in LLM-Based Security Log Evaluation

Chaitanya Vilas Garware, Sharif Noor Zisad

The paper demonstrates that relying on strict regular-expression parsing for evaluating LLM-based security log classifiers introduces systematic errors, potentially causing a functional model to appea…

View →
cs.CRRecentMay 27, 2026

Cybersecurity AI (CAI) Dataset

Víctor Mayoral-Vilches

The paper introduces the CAI Dataset, a massive, multi-terabyte corpus of real-world, hands-on cybersecurity LLM trajectories, designed to address the performance bottleneck caused by expert operator…

View →
cs.CRRecentMay 21, 2026

Parser-Free Querying of Security Logs

Evan Luo, Julien Piet, David Wagner

The paper introduces Sieve, a system that uses a large language model (LLM) to generate executable query code from natural language security questions, significantly improving the ability to perform c…

View →
cs.CRcs.AIRecentApr 21, 2026

Cyber Defense Benchmark: Agentic Threat Hunting Evaluation for LLMs in SecOps

Alankrit Chona, Igor Kozlov, Ambuj Kumar

The paper introduces a challenging benchmark for LLM agents to perform unsupervised threat hunting on raw Windows event logs, finding that current frontier models perform poorly and are not ready for…

View →
cs.CRcs.AIRecentMar 25, 2026

Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage

Rishikesh Sahay, Bell Eapen, Weizhi Meng, Md Rasel Al Mamun +4 more

The paper proposes an automated, LLM-enabled threat hunting framework integrated with Splunk to help SOC analysts autonomously monitor evolving threats and prioritize suspicious network traffic.

View →
cs.CRcs.LGRecentMay 23, 2026

Poisoning the Watchtower: Prompt Injection Attacks Against LLM-Augmented Security Operations Through Adversarial Log Content

Rohan Pandey, Archit Bhujang

The paper introduces 'log-substrate prompt injection,' demonstrating that attacker-controlled log fields can be used to manipulate LLM-powered security analysis, with persona hijacking and context man…

View →
cs.CRRecentMar 23, 2026

Semi-Automated Threat Modeling of Cloud-Based Systems Through Extracting Software Architecture from Configuration and Network Flow

Nicholas Pecka, Lotfi Ben Othmane, Bharat Bhargava, Renee Bryce

The paper proposes a novel semi-automated method to perform continuous threat modeling by inferring the actual system architecture from combined static configuration and dynamic network flow data, sig…

View →
cs.CRcs.IRcs.LGRecentJun 3, 2026

NLLog: Lightweight, Explainable SOC Anomaly Detection via Log-to-Language Rewriting

Samuel Ndichu, Tao Ban, Seiichi Ozawa, Takeshi Takahashi +1 more

NLLog introduces a lightweight system that converts structured security logs into natural language sentences for improved anomaly detection, achieving high performance with low false-positive rates su…

View →
cs.CRcs.IRcs.LGRecentJun 3, 2026

NLLog: Lightweight, Explainable SOC Anomaly Detection via Log-to-Language Rewriting

Samuel Ndichu, Tao Ban, Seiichi Ozawa, Takeshi Takahashi +1 more

NLLog is a lightweight pipeline that rewrites system-generated logs into natural language for improved analysis and comprehension.

View →
cs.AIcs.CRRecentMay 11, 2026

MATRA: Modeling the Attack Surface of Agentic AI Systems -- OpenClaw Case Study

Tim Van hamme, Thomas Vissers, Javier Carnerero-Cano, Mario Fritz +3 more

The paper introduces MATRA, a systematic threat modeling framework, to assess how known LLM threats translate into concrete, deployment-specific risks within autonomous agentic AI systems.

View →
cs.CRcs.AIRecentMar 29, 2026

A Security Analysis of the OpenClaw AI Agent Framework

Surada Suwansathit, Yuxuan Zhang, Guofei Gu

This paper analyzes 470 security advisories in the OpenClaw AI agent framework, demonstrating that the system's structural weakness lies in per-layer trust enforcement, enabling cross-layer remote cod…

View →
cs.CRcs.AIRecentApr 7, 2026

LanG -- A Governance-Aware Agentic AI Platform for Unified Security Operations

Anes Abdennebi, Nadjia Kara, Laaziz Lahlou, Hakima Ould-Slimane

LanG is a governance-aware, open-source agentic AI platform that unifies security operations by providing advanced correlation, automated rule generation, and attack reconstruction capabilities.

View →
cs.CRRecentApr 14, 2026

From IOCs to Regex: Automating CTI Operationalization for SOC with LLMs

Pei-Yu Tseng, Lan Zhang, ZihDwo Yeh, Xiaoyan Sun +2 more

The paper introduces IOCRegex-gen, an automated LLM-based system that converts Indicators of Compromise (IOCs) into syntactically and semantically correct regular expressions, achieving high accuracy…

View →
cs.CRcs.AIRecentApr 3, 2026

A Systematic Security Evaluation of OpenClaw and Its Variants

Yuhang Wang, Haichang Gao, Zhenxing Niu, Zhaoxiang Liu +3 more

The paper systematically evaluates six OpenClaw-series AI agent frameworks, demonstrating that these agentized systems possess significant security vulnerabilities that are distinct from and more seve…

View →
cs.CRcs.CLRecentMay 14, 2026

Talk is (Not) Cheap: A Taxonomy and Benchmark Coverage Audit for LLM Attacks

Karthik Raghu Iyer, Yazdan Jamshidi, Nicholas Bray, Alexey A. Shvets

The paper introduces a comprehensive taxonomy and auditing framework to assess the collective coverage of existing LLM attack benchmarks, revealing significant and systematic gaps in current testing m…

View →
cs.CRRecentMay 7, 2026

Beyond Collection: Measuring the Detection Efficacy of Modern Security Logging Standards

Ryan Holeman, John Hastings, Varghese Mathew Vaidyan

This paper systematically evaluates modern security logging standards (CIM, OCSF, ECS) using a novel framework to quantify their detection efficacy across diverse exploit scenarios, revealing critical…

View →
cs.CRcs.LGRecentMay 20, 2026

HIDBench: Benchmarking Large Language Models for Host-Based Intrusion Detection

Danyu Sun, Jinghuai Zhang, Yuan Tian, Zhou Li

The paper introduces HIDBench, a new benchmark for evaluating LLMs' ability to perform host-based intrusion detection using complex, noisy system logs, finding that model performance degrades signific…

View →
cs.CRcs.AIRecentApr 22, 2026

CyberCertBench: Evaluating LLMs in Cybersecurity Certification Knowledge

Gustav Keppler, Ghada Elbez, Veit Hagenmeyer

The paper introduces CyberCertBench, a new benchmark suite for evaluating LLMs against industry cybersecurity certifications, finding that while frontier models perform well on general knowledge, thei…

View →