~ similar to 2603.20108v1· 20 results
This paper analyzes the latency-accuracy trade-offs of various TinyML models for detecting diverse cyber-RF threats on autonomous spacecraft, finding that Logistic Regression offers an effective, low-…
Quang Duc Nguyen, Siyuan Liang, Yiming Li, Fushuo Huo +1 more
The paper proposes TimeGuard, a novel channel-wise pool training defense, to significantly improve the robustness of time series forecasting against backdoor attacks by addressing signal dilution and…
This paper provides the first comprehensive threat model for IoT-enabled Controlled Environment Agriculture (CEA) systems, identifying 123 unique threats and proposing a defense-in-depth framework to…
The paper introduces a challenging benchmark for LLM agents to perform unsupervised threat hunting on raw Windows event logs, finding that current frontier models perform poorly and are not ready for…
The paper proposes INTARG, an informed and selective adversarial attack framework for time-series forecasting that significantly increases prediction error by targeting only the most vulnerable time s…
The paper introduces COBALT, a Z3 SMT-based formal verification engine, to proactively detect arithmetic vulnerabilities (CWE-190/191/195) in the critical infrastructure surrounding frontier AI models…
The paper investigates forecasting sparse and bursty vulnerability sightings, concluding that traditional time-series models like SARIMAX are inadequate, and count-based methods like Poisson regressio…
Rishikesh Sahay, Bell Eapen, Weizhi Meng, Md Rasel Al Mamun +4 more
The paper proposes an automated, LLM-enabled threat hunting framework integrated with Splunk to help SOC analysts autonomously monitor evolving threats and prioritize suspicious network traffic.
Vivek Dahiya, Sunny Nehra, Vipul Dholariya, Bhavik Shangari +1 more
The paper evaluates frontier LLMs on cybersecurity tasks using dual-mode benchmarks and concludes that general-purpose models are insufficient, advocating for specialized, vertical foundation models.
This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…
This study empirically measures the consistency and effectiveness of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation rates am…
This paper addresses the critical need for trustworthy LLMs in science by proposing a comprehensive, multi-layered defense framework and methodology to evaluate unique scientific vulnerabilities.
Hyomin Lee, Sangwoo Park, Yumin Choi, Sohyun An +2 more
The paper introduces T-MAP, a trajectory-aware evolutionary search method, to discover and generate multi-step adversarial prompts that exploit vulnerabilities in autonomous LLM agents through tool ex…
This paper surveys the risks associated with world models, proposing a unified threat model and demonstrating adversarial attacks that show world models require rigorous safety standards comparable to…
This paper provides the first systematic threat analysis of State-Space Models (SSMs) in safety-critical applications, introducing novel attack classes and formal metrics to quantify their security an…
The paper proposes a Digital Twin (DT)-driven hybrid system that combines deterministic heuristics and constrained Large Language Model (LLM) reasoning to achieve highly accurate and interpretable rea…
The paper identifies a critical vulnerability, the Camouflage Detection Gap (CDG), where standard LLM injection detectors fail dramatically when malicious payloads mimic the target domain's language a…
Shuhao Zhang, Jiarui Li, Qi Cao, Ruiyi Zhang +1 more
The paper introduces SCOUT, a dynamic detector allocation framework that improves prompt-injection defense by predicting detector reliability and latency to optimize the trade-off between safety and o…
The paper systematically maps LLM agent vulnerabilities by testing 10,000 prompt variations, finding that 'goal reframing' language is the primary trigger for exploitation, rather than broad adversari…
The paper demonstrates that advanced capabilities, such as jailbreaking large language models and finding software vulnerabilities, can be achieved effectively at zero cost by coordinating multiple sm…