~ similar to 2604.09153v1 · 20 results
The paper proposes a nine-dimension risk assessment framework for institutional DeFi adoption, extending existing methodologies with new dimensions such as composability…
This paper evaluates and compares HAZOP and Bow-Tie analysis, demonstrating that while both are useful for cyber risk assessment in hydropower, a coordinated adversary can bypass conventional safeguar…
The paper argues that despite the focus on risk, the cybersecurity profession is structurally trained as a threat-management discipline, leading to poor foundational risk reasoning among professionals…
Zheng-Xin Yong, Parv Mahajan, Andy Wang, Ida Caspary +11 more
The paper conducts a preliminary safety evaluation of the open-weight LLM Kimi K2.5, finding that while it is highly capable, it exhibits concerning dual-use risks, particularly regarding CBRNE misuse…
Yuxing Lu, Yushuhong Lin, Wenqi Shi, J. Ben Tamo +3 more
The paper introduces ClinEnv, an interactive, multi-stage benchmark designed to evaluate LLMs' decision-making and information-gathering processes during longitudinal inpatient medical simulations.
The paper formalizes the concept of a causal pathway for rare events, showing that testable implications can be derived solely from this pathway abstraction, simplifying complex causal modeling.
The paper analyzes the CIIM risk model using postphenomenology, arguing that such formal models act as mediating artifacts that fundamentally shape how cybersecurity practitioners perceive and respond…
Yiran Qiao, Jing Chen, Jiaqi Xu, Yang Liu +2 more
The paper proposes a novel framework, LPCD, that uses latent causal modeling to robustly assess evolving adversarial risks in live streaming by decoupling malicious intent from superficial tactical sh…
Alex Leung, Rex Zhang, Ervin Ling, Kentaroh Toyoda +1 more
This paper maps the emerging insurability frontier of AI risk by coding 55 AI threat classes against 26 insurance products, identifying four tiers of coverage: affirmative, silent, excluded, and outsi…
The paper addresses the lack of user understanding regarding the actions and residual effects of advanced computer-use agents by proposing AgentTrace, a traceability framework for visualizing agent be…
The paper proposes MVRAF, a data-driven framework that quantifies vulnerability risk in large-scale cloud infrastructure by integrating multiple attack attributes and analyzing cumulative risk distrib…
The paper introduces SafetyDrift, a predictive model that forecasts when AI agents will violate safety protocols by analyzing the cumulative risk across sequences of individually safe actions.
The paper proposes a management framework, using a governed AI query-broker artifact, to safely integrate generative AI into high-risk operational decision support, such as Security Operations Centers…
This paper introduces an entropy-based method to generate multiple plausible causal maps (atlases) that accurately reflect the inherent structural ambiguity in complex systems, moving beyond single, o…
This paper introduces a foundational framework and taxonomy for managing catastrophic AI loss of control (LOC) incidents, providing a proportional guide for response based on the severity and recovera…
The study compared the cybersecurity risk assessment capabilities of five popular large language models (LLMs) against human experts, finding that LLMs consistently underestimated risks and require ma…
The paper investigates predictive multiplicity and arbitrariness in recidivism risk assessment, finding that similarly accurate models often exhibit high predictive agreement, and proposes a simple po…
Mikhail L. Arbuzov, Lee Mosbacker, Sisong Bei, Ziwei Dong +2 more
The paper reframes LLM reliability from an impossible universal problem to a manageable, local patch-based problem, showing that sufficient interventions can be found by focusing on recurring failure…
The paper introduces an agentic, framework-based system to transform under-specified academic papers into standardized, comparable, and executable benchmarks for industrial Prognostics and Health Mana…
Di Lu, Yongzhi Liao, Xutong Mu, Lele Zheng +4 more
The paper identifies that the convenience of host-acting agents leads to semantic under-specification in user goals, which forces the agent to generate potentially risky execution plans.