Similar to 2604.24083v1 · 20 results
The paper proposes a novel information-geometric framework to analyze LLM stability by integrating task utility, external entropy, and internal structural proxies, showing this composite score improve…
This paper surveys the risks associated with world models, proposing a unified threat model and demonstrating adversarial attacks that show world models require rigorous safety standards comparable to…
The paper demonstrates that fine-tuning safety guard models on benign data can catastrophically collapse their safety alignment, proposing Fisher-Weighted Safety Subspace Regularization (FW-SSR) to ac…
The paper demonstrates that standard homomorphic encryption (HE) schemes are insufficient to guarantee integrity in networked control systems (NCS) against covert attacks, proposing instead a verifiab…
The paper introduces SafetyDrift, a predictive model that forecasts when AI agents will violate safety protocols by analyzing the cumulative risk across sequences of individually safe actions.
The paper develops a unified mathematical framework to analyze the interaction between post-quantum security, real-time communication constraints, and closed-loop stability in safety-critical turbofan…
The paper proposes using geometric metrics, specifically eigenspace alignment, to monitor the structural integrity of large behavioral populations, demonstrating its effectiveness in detecting network…
The paper proposes an algorithmic method using conformal prediction to formally certify high-probability safety for Belief-Space Neural Safety Filters (BeliefSF), significantly improving safety guaran…
Kerri Prinos, Lilianne Brush, Cameron Denton, Zhanqi Wang +4 more
The paper proposes a tool-mediated LLM architecture for autonomous cyber defense, formally proving its stability and demonstrating that it significantly reduces an attacker's expected payoff in real-w…
Junze Zhu, Weihao Chen, Xuanwang Zhang, Zhen Wu +1 more
The paper proposes an Entropy Dynamics framework to analyze the stability and failure modes of centralized orchestration in Multi-Agent Systems, identifying a 'Reasoning Trap' where complex reasoning…
This paper provides the first systematic threat analysis of State-Space Models (SSMs) in safety-critical applications, introducing novel attack classes and formal metrics to quantify their security an…
AEGIS introduces a novel physics-based system that analyzes encrypted network traffic flow dynamics, achieving state-of-the-art zero-day evasion detection with high accuracy and low latency.
The paper evaluates quantum machine learning for detecting anomalies in UAVs using a rigorous, leakage-free methodology, showing that a hybrid XGBoost + Data Reuploading classifier performs well, part…
Zheng-Xin Yong, Parv Mahajan, Andy Wang, Ida Caspary +11 more
The paper conducts a preliminary safety evaluation of the open-weight LLM Kimi K2.5, finding that while it is highly capable, it exhibits concerning dual-use risks, particularly regarding CBRNE misuse…
Saurabh Bagchi, Hyunseung Kim, Tarek Abdelzaher, Homa Alemzadeh +19 more
This survey provides a comprehensive, systematic roadmap for achieving cyber-physical system (CPS) resilience by integrating five interconnected themes: system-wide properties, handling data scarcity…
The paper proposes the Policy-Execution-Authorization (PEA) architecture, a separation-of-powers system designed to structurally enforce goal integrity in AI agents, moving safety from a probabilistic…
The paper proves that standard runtime enforcement mechanisms cannot detect systematic behavioral drift in autonomous agents, proposing a new Invariant Measurement Layer (IML) that restores observabil…
The paper introduces a unified Physics-Informed Deep Learning (PIDL) framework that simultaneously enforces physical laws and information-theoretic bounds, demonstrating robust, domain-agnostic entrop…
This paper introduces the first agent-based model for the FAIR-CAM framework, demonstrating that complex, dynamic control degradation and resource constraints lead to emergent security vulnerabilities…
Ting Xu, Xu He, Yupu Lu, Jiankai Sun +3 more
The paper analyzes the entropy dynamics of Chain-of-Thought (CoT) reasoning, identifying a transition from an exploratory Uncertainty Region to a stable Confidence Region, which enables superior early…