Papers similar to 2605.24551v1

Similar to 2605.24551v1 · 20 results

cs.CR · cs.CY · Recent · May 20, 2026

Profiling User Vulnerability to Phishing Through Psychological and Behavioral Factors

Valeria Formisano, Danilo Gentile, Gennaro Esposito Mocerino, Michela Ponticorvo +3 more

This study profiles user vulnerability to phishing by identifying key psychological and behavioral factors, revealing that most users are high-risk due to hasty decision-making rather than lacking tec…

cs.CR · Recent · May 21, 2026

Human Vulnerability Assessment in Cybersecurity: A Systematic Literature Review of Methods, Models, and Instruments

Dimitra Papatsaroucha, Stavroula Psaroudaki, Eleftheria Vassilaki, Konstantina Pityanou +3 more

This systematic literature review analyzes existing methods, models, and instruments for assessing human vulnerability in cybersecurity, concluding that current approaches are fragmented and lack a dy…

cs.CR · cs.AI · Recent · May 28, 2026

How Reliable Are AI Attackers Against a Fixed Vulnerable Target? A 400-Run Empirical Study of LLM Penetration Testing Consistency

Galip Tolga Erdem

This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…

cs.CR · cs.AI · Recent · May 28, 2026

How Reliable Are AI Attackers Against a Fixed Vulnerable Target? A 400-Run Empirical Study of LLM Penetration Testing Consistency

Galip Tolga Erdem

This study empirically measures the consistency and effectiveness of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation rates am…

cs.CR · cs.CY · econ.GN · Recent · Apr 23, 2026

Mitigate or Fail: How Risk Management Shapes Cybersecurity Competency

Jeffrey T. Gardiner

The paper argues that despite the focus on risk, the cybersecurity profession is structurally trained as a threat-management discipline, leading to poor foundational risk reasoning among professionals…

cs.CR · Recent · Apr 16, 2026

ConGISATA: A Framework for Continuous Gamified Information Security Awareness Training and Assessment

Ofir Cohen, Ron Bitton, Asaf Shabtai, Rami Puzis

The paper proposes ConGISATA, a continuous, gamified framework using embedded mobile sensors to enhance individual information security awareness by transforming passive risks into active learning opp…

cs.CR · Recent · Jun 4, 2026

Exploring the connection between coding habits and cognitive styles in malware developers

Vasilis Vouvoutsis, Constantinos Patsakis, Fran Casino

The study analyzes coding patterns in malware versus benign software, finding that malware code is optimized for quick evasion and secrecy rather than maintainability, though its metrics are not uniqu…

cs.CR · cs.AI · Recent · May 8, 2026

CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios

Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim +1 more

The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.

cs.CR · cs.AI · Recent · May 22, 2026

Are Frontier LLMs Ready for Cybersecurity? Evidence for Vertical Foundation Models from Dual-Mode Vulnerability Benchmarks

Vivek Dahiya, Sunny Nehra, Vipul Dholariya, Bhavik Shangari +1 more

The paper evaluates frontier LLMs on cybersecurity tasks using dual-mode benchmarks and concludes that general-purpose models are insufficient, advocating for specialized, vertical foundation models.

cs.CR · cs.AI · cs.CL · Recent · Apr 7, 2026

Swiss-Bench 003: Evaluating LLM Reliability and Adversarial Security for Swiss Regulatory Contexts

Fatih Uenal

This paper introduces Swiss-Bench 003, an expanded evaluation framework assessing LLM reliability and adversarial security across eight dimensions using 808 Swiss-specific items, revealing that self-g…

cs.CR · cs.AI · cs.CL · Recent · Apr 3, 2026

An Independent Safety Evaluation of Kimi K2.5

Zheng-Xin Yong, Parv Mahajan, Andy Wang, Ida Caspary +11 more

The paper conducts a preliminary safety evaluation of the open-weight LLM Kimi K2.5, finding that while it is highly capable, it exhibits concerning dual-use risks, particularly regarding CBRNE misuse…

cs.CR · Recent · May 26, 2026

Assessor Experiences in CMMC Level 2 Certification Assessments: An Interpretative Phenomenological Analysis of Role Expectations

Samuel Heuchert, John Hastings

This study explores how CMMC assessors navigate the conflicting role expectations of maintaining impartiality within a non-consultative assessment model, finding that they rely on technical competence…

cs.CR · Recent · Apr 27, 2026

ARCANE: Cross-Campaign Attacker Re-identification via Passive Beacon Telemetry -- A Bayesian Network Framework for Longitudinal Cyber Attribution

Abraham Itzhak Weinberg

The paper introduces ARCANE, a Bayesian network framework for cross-campaign cyber attribution, finding that while aggregating telemetry improves identification, structural feature limitations prevent…

cs.CR · cs.AI · Recent · Apr 3, 2026

Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis

Zhiyuan Li, Jingzheng Wu, Xiang Ling, Xing Cui +1 more

This paper provides the first comprehensive security analysis of the Agent Skills framework, identifying severe structural vulnerabilities that require fundamental architectural changes rather than si…

cs.CL · cs.AI · cs.CR · Recent · Mar 31, 2026

Can LLMs Infer Conversational Agent Users' Personality Traits from Chat History?

Derya Cögendez, Verena Zimmermann, Noé Zufferey

This study quantifies the privacy risk of inferring sensitive personality traits from user interactions with LLM-based conversational agents, demonstrating that machine learning models can accurately…

cs.AI · cs.CR · Recent · May 22, 2026

Safety-Oriented Routing Analysis of Mixtral MoE Under Benign and Harmful Prompts

Md Nurul Absar Siddiky

The paper analyzes the routing behavior of Mixtral MoE under benign and harmful prompts using activation and gradient signals, finding that safety-relevant routing is subtle, depth-dependent, and dist…

cs.CR · cs.AI · Recent · Apr 21, 2026

Cyber Defense Benchmark: Agentic Threat Hunting Evaluation for LLMs in SecOps

Alankrit Chona, Igor Kozlov, Ambuj Kumar

The paper introduces a challenging benchmark for LLM agents to perform unsupervised threat hunting on raw Windows event logs, finding that current frontier models perform poorly and are not ready for…

cs.CR · Recent · May 13, 2026

Do Skill Descriptions Tell the Truth? Detecting Undisclosed Security Behaviors in Code-Backed LLM Skills

Wenhui He, Yue Li, Bang Fu, Huan Xing +3 more

The paper introduces SKILLSCOPE, a system that detects security-relevant behaviors in code-backed LLM skills that are not disclosed in the natural language description, finding that 9.4% of skills exh…

cs.CR · cs.AI · cs.CL · Recent · Jun 3, 2026

Domain-Conditioned Safety in Frontier Computer-Using Agents: A 793-Episode Browser Benchmark, a Coding-Domain Cross-Reference, and a Reproducibility Audit of Recent Red-Teaming

Nicholas Saban

The paper benchmarks current frontier computer-using agents against hand-crafted attacks, finding that while they are highly safe in browser tasks, this safety does not generalize to other domains lik…

eess.SY · cs.AI · cs.CR · Recent · Mar 20, 2026

An Agentic Multi-Agent Architecture for Cybersecurity Risk Management

Ravish Gupta, Saket Kumar, Shreeya Sharma, Maulik Dang +1 more

The paper introduces a novel six-agent AI architecture for cybersecurity risk assessment, demonstrating high accuracy and speed compared to human experts, though its performance is ultimately limited…