~ similar to 2606.03386v1· 20 results
The paper proposes a unified closed-loop threat taxonomy to systematically analyze and defend foundation models by explicitly framing the bidirectional security interactions between data and models.
The paper introduces a challenging benchmark for LLM agents to perform unsupervised threat hunting on raw Windows event logs, finding that current frontier models perform poorly and are not ready for…
Ahmed Sabbah, Mohammed Kharma, Radi Jarrar, Samer Zein +1 more
This study longitudinally evaluates the adversarial robustness of Android malware detection systems over a decade, finding that temporal separation significantly degrades robustness due to concept dri…
The paper introduces an end-to-end framework that not only detects network intrusions using deep learning but also generates actionable, citation-grounded mitigation reports using a Retrieval-Augmente…
The paper investigates forecasting sparse and bursty vulnerability sightings, concluding that traditional time-series models like SARIMAX are inadequate, and count-based methods like Poisson regressio…
Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim +1 more
The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.
SentinelSphere is an AI platform that integrates advanced deep learning for real-time threat detection with an LLM-powered training system to holistically address both technical and human-factor cyber…
The paper introduces a comprehensive taxonomy and auditing framework to assess the collective coverage of existing LLM attack benchmarks, revealing significant and systematic gaps in current testing m…
ZERO-APT introduces a novel closed-loop adversarial framework for automated penetration testing that simulates attacks against an intelligent, real-time defending system, achieving a high attack succe…
The paper introduces the CAI Dataset, a massive, multi-terabyte corpus of real-world, hands-on cybersecurity LLM trajectories, designed to address the performance bottleneck caused by expert operator…
The paper argues that zero-day attacks primarily exploit undisclosed vulnerabilities rather than exhibiting novel behaviors, advocating for vulnerability-centric detection methods over purely behavior…
The paper introduces STRIDE-AI, a novel threat modeling framework that adapts classical STRIDE for generative AI, successfully reducing the attack success rate of a tested LLM chatbot from 80% to 15%.
Philip Huff, Dakota Dale, Harshith Guduru, Rohan Singh +1 more
The paper proposes a system that operationalizes cybersecurity governance frameworks by integrating them with attack-path modeling and Deep Reinforcement Learning to generate practical, resource-const…
The paper introduces GenTI, a novel LLM-driven benchmark and dataset, to automatically generate high-quality, deployable IDPS rules for detecting unseen and zero-day cyber attacks.
This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…
This study empirically measures the consistency and effectiveness of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation rates am…
Jianan Huang, Rodolfo V. Valentim, Luca Vassio, Matteo Boffa +3 more
The paper proposes a multi-modal contrastive learning framework to improve the generalization of machine learning models in cybersecurity by transferring knowledge from rich textual vulnerability desc…
The paper empirically evaluates domain-adapted and general-purpose LLMs for structured threat modelling (STRIDE on 5G security), finding that domain adaptation and model size do not guarantee reliable…
The paper demonstrates that advanced capabilities, such as jailbreaking large language models and finding software vulnerabilities, can be achieved effectively at zero cost by coordinating multiple sm…
The paper introduces Auto-ART, a comprehensive open-source framework that provides structured meta-analysis and automated testing for adversarial robustness, revealing significant gaps in current ML s…