Papers similar to 2605.05424v1

~ similar to 2605.05424v1· 20 results

cs.CRcs.AIRecentApr 11, 2026

Like a Hammer, It Can Build, It Can Break: Large Language Model Uses, Perceptions, and Adoption in Cybersecurity Operations on Reddit

Souradip Nath, Chih-Yi Huang, Aditi Ganapathi, Kashyap Thimmaraju +2 more

Analyzing Reddit discussions, the paper finds that while security practitioners see LLMs as useful for boosting productivity, their adoption is constrained by concerns over reliability, verification,…

View →

cs.CRcs.AIRecentMay 11, 2026

Threat Modelling using Domain-Adapted Language Models: Empirical Evaluation and Insights

Saba Pourhanifeh, AbdulAziz AbdulGhaffar, Ashraf Matrawy

The paper empirically evaluates domain-adapted and general-purpose LLMs for structured threat modelling (STRIDE on 5G security), finding that domain adaptation and model size do not guarantee reliable…

View →

cs.CRcs.CYecon.GNRecentApr 23, 2026

Mitigate or Fail: How Risk Management Shapes Cybersecurity Competency

Jeffrey T. Gardiner

The paper argues that despite the focus on risk, the cybersecurity profession is structurally trained as a threat-management discipline, leading to poor foundational risk reasoning among professionals…

View →

cs.CRcs.AIRecentMar 17, 2026

Security Assessment and Mitigation Strategies for Large Language Models: A Comprehensive Defensive Framework

Taiwo Onitiju, Iman Vakilinia

The paper establishes a standardized security assessment framework and develops a multi-layered defensive system, demonstrating that systematic testing and external defenses are crucial for safe LLM d…

View →

cs.CRcs.AIRecentApr 7, 2026

Towards the Development of an LLM-Based Methodology for Automated Security Profiling in Compliance with Ukrainian Cybersecurity Regulations

Daniil Shafranskyi, Iryna Stopochkina, Mykola Ilin

The paper proposes an LLM-enhanced methodology using RAG to automate the creation of security profiles, ensuring compliance with Ukrainian cybersecurity regulations and international best practices.

View →

cs.CRRecentMar 24, 2026

Security Barriers to Trustworthy AI-Driven Cyber Threat Intelligence in Finance: Evidence from Practitioners

Emir Karaosman, Advije Rizvani, Irdin Pekaric

This paper investigates the practical barriers preventing the trustworthy deployment of AI-driven Cyber Threat Intelligence (CTI) in the highly regulated financial sector, identifying four key socio-t…

View →

eess.SYcs.AIcs.CRRecentMar 20, 2026

An Agentic Multi-Agent Architecture for Cybersecurity Risk Management

Ravish Gupta, Saket Kumar, Shreeya Sharma, Maulik Dang +1 more

The paper introduces a novel six-agent AI architecture for cybersecurity risk assessment, demonstrating high accuracy and speed compared to human experts, though its performance is ultimately limited…

View →

cs.CRRecentApr 28, 2026

Large Language Models as Explainable Cyberattack Detectors for Energy Industrial Control Systems

Weiyi Kong, Ahmad Mohammad Saber, Amr Youssef, Deepa Kundur

This paper demonstrates that an off-the-shelf Large Language Model (LLM) can function as a high-performing, explainable, human-in-the-loop layer for detecting cyberattacks in Industrial Control System…

View →

cs.CRcs.AIRecentApr 7, 2026

From Incomplete Architecture to Quantified Risk: Multimodal LLM-Driven Security Assessment for Cyber-Physical Systems

Shaofei Huang, Christopher M. Poskitt, Lwin Khin Shar

The paper introduces ASTRAL, a multimodal LLM-driven framework that reconstructs and analyzes fragmented cyber-physical system architectures to enable comprehensive and quantitative security risk asse…

View →

cs.CRcs.AIRecentApr 22, 2026

CyberCertBench: Evaluating LLMs in Cybersecurity Certification Knowledge

Gustav Keppler, Ghada Elbez, Veit Hagenmeyer

The paper introduces CyberCertBench, a new benchmark suite for evaluating LLMs against industry cybersecurity certifications, finding that while frontier models perform well on general knowledge, thei…

View →

cs.CRcs.CERecentApr 10, 2026

Conversations Risk Detection LLMs in Financial Agents via Multi-Stage Generative Rollout

Xiaotong Jiang, Jun Wu

The paper proposes FinSec, a novel four-tier security detection framework, to robustly identify complex financial risks and suspicious dialogue patterns in LLM-powered financial agents, achieving stat…

View →

cs.CRcs.AIRecentMay 17, 2026

Ablating Safety: Mechanisms for Removing Alignment in Language Models for Security Applications

Isaac David, Arthur Gervais

The paper proposes Ablating Safety, a controlled protocol for removing safety alignment from language models, demonstrating that targeted de-alignment can significantly boost security performance whil…

View →

cs.CRRecentApr 23, 2026

A Sociotechnical, Practitioner-Centered Approach to Technology Adoption in Cybersecurity Operations: An LLM Case

Francis Hahn, Mohd Mamoon, Alexandru G. Bardas, Michael Collins +3 more

The paper demonstrates that adopting LLM-based tools in cybersecurity operations requires a sociotechnical, practitioner-centered co-creation approach, which successfully overcame historical adoption…

View →

cs.CRcs.AIcs.CLRecentApr 3, 2026

An Independent Safety Evaluation of Kimi K2.5

Zheng-Xin Yong, Parv Mahajan, Andy Wang, Ida Caspary +11 more

The paper conducts a preliminary safety evaluation of the open-weight LLM Kimi K2.5, finding that while it is highly capable, it exhibits concerning dual-use risks, particularly regarding CBRNE misuse…

View →

cs.CRcs.CVRecentMar 18, 2026

Toward Reliable, Safe, and Secure LLMs for Scientific Applications

Saket Sanjeev Chaturvedi, Joshua Bergerson, Tanwi Mallick

This paper addresses the critical need for trustworthy LLMs in science by proposing a comprehensive, multi-layered defense framework and methodology to evaluate unique scientific vulnerabilities.

View →

cs.CLcs.AIcs.CRRecentMay 13, 2026

ROK-FORTRESS: Measuring the Effect of Geopolitical Transcreation for National Security and Public Safety

Michael S. Lee, Yash Maurya, Drew Rein, Bert Herring +12 more

The paper introduces ROK-FORTRESS, a novel bilingual, culturally adversarial benchmark that demonstrates that LLM safety behavior in high-stakes scenarios is significantly shaped by the interaction be…

View →

cs.CRcs.SERecentMay 4, 2026

A Validated Prompt Bank for Malicious Code Generation: Separating Executable Weapons from Security Knowledge in 1,554 Consensus-Labeled Prompts

Richard J. Young, Gregory D. Moody

The paper introduces a validated, consensus-labeled prompt bank that separates requests for executable malicious code (weapons) from requests for general harmful security knowledge, providing a more g…

View →

cs.CRRecentMay 27, 2026

Cybersecurity AI (CAI) Dataset

Víctor Mayoral-Vilches

The paper introduces the CAI Dataset, a massive, multi-terabyte corpus of real-world, hands-on cybersecurity LLM trajectories, designed to address the performance bottleneck caused by expert operator…

View →

cs.CRRecentMay 8, 2026

An Automated Framework for Cybersecurity Policy Compliance Assessment Against Security Control Standards

Bikash Saha, Sandeep Kumar Shukla

The paper introduces PROPARAG, an automated framework that autonomously assesses how well organizational cybersecurity policies comply with standard security controls, achieving high F1 scores on real…

View →

cs.CRcs.AIRecentMay 16, 2026

STRIDE-AI: A Threat Modeling Framework for Generative AI Security Assessment

Tsafac Nkombong Regine Cyrille, Franziska Schwarz

The paper introduces STRIDE-AI, a novel threat modeling framework that adapts classical STRIDE for generative AI, successfully reducing the attack success rate of a tested LLM chatbot from 80% to 15%.

View →