~ similar to 2605.24551v1· 20 results
This study profiles user vulnerability to phishing by identifying key psychological and behavioral factors, revealing that most users are high-risk due to hasty decision-making rather than lacking tec…
This systematic literature review analyzes existing methods, models, and instruments for assessing human vulnerability in cybersecurity, concluding that current approaches are fragmented and lack a dy…
This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…
This study empirically measures the consistency and effectiveness of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation rates am…
The paper argues that despite the focus on risk, the cybersecurity profession is structurally trained as a threat-management discipline, leading to poor foundational risk reasoning among professionals…
The paper proposes ConGISATA, a continuous, gamified framework using embedded mobile sensors to enhance individual information security awareness by transforming passive risks into active learning opp…
The study analyzes coding patterns in malware versus benign software, finding that malware code is optimized for quick evasion and secrecy rather than maintainability, though its metrics are not uniqu…
Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim +1 more
The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.
Vivek Dahiya, Sunny Nehra, Vipul Dholariya, Bhavik Shangari +1 more
The paper evaluates frontier LLMs on cybersecurity tasks using dual-mode benchmarks and concludes that general-purpose models are insufficient, advocating for specialized, vertical foundation models.
This paper introduces Swiss-Bench 003, an expanded evaluation framework assessing LLM reliability and adversarial security across eight dimensions using 808 Swiss-specific items, revealing that self-g…
Zheng-Xin Yong, Parv Mahajan, Andy Wang, Ida Caspary +11 more
The paper conducts a preliminary safety evaluation of the open-weight LLM Kimi K2.5, finding that while it is highly capable, it exhibits concerning dual-use risks, particularly regarding CBRNE misuse…
This study explores how CMMC assessors navigate the conflicting role expectations of maintaining impartiality within a non-consultative assessment model, finding that they rely on technical competence…
The paper introduces ARCANE, a Bayesian network framework for cross-campaign cyber attribution, finding that while aggregating telemetry improves identification, structural feature limitations prevent…
Zhiyuan Li, Jingzheng Wu, Xiang Ling, Xing Cui +1 more
This paper provides the first comprehensive security analysis of the Agent Skills framework, identifying severe structural vulnerabilities that require fundamental architectural changes rather than si…
This study quantifies the privacy risk of inferring sensitive personality traits from user interactions with LLM-based conversational agents, demonstrating that machine learning models can accurately…
The paper analyzes the routing behavior of Mixtral MoE under benign and harmful prompts using activation and gradient signals, finding that safety-relevant routing is subtle, depth-dependent, and dist…
The paper introduces a challenging benchmark for LLM agents to perform unsupervised threat hunting on raw Windows event logs, finding that current frontier models perform poorly and are not ready for…
The paper introduces SKILLSCOPE, a system that detects security-relevant behaviors in code-backed LLM skills that are not disclosed in the natural language description, finding that 9.4% of skills exh…
The paper benchmarks current frontier computer-using agents against hand-crafted attacks, finding that while they are highly safe in browser tasks, this safety does not generalize to other domains lik…
Ravish Gupta, Saket Kumar, Shreeya Sharma, Maulik Dang +1 more
The paper introduces a novel six-agent AI architecture for cybersecurity risk assessment, demonstrating high accuracy and speed compared to human experts, though its performance is ultimately limited…