~ similar to 2605.20368v1· 20 results
Yeseul E. Chang, Rahul Kailasa, Simon Shim, Byunghoon Oh +1 more
The paper proposes Retrieval Augmented Classification (RAC) as a robust, low-leakage method for classifying confidential documents, demonstrating that RAC outperforms supervised fine-tuning (FT) parti…
OpenSOC-AI is a lightweight framework that uses parameter-efficient fine-tuning of a small LLM to automate threat classification and severity assessment from raw security logs, significantly improving…
The paper establishes a standardized security assessment framework and develops a multi-layered defensive system, demonstrating that systematic testing and external defenses are crucial for safe LLM d…
The paper empirically evaluates domain-adapted and general-purpose LLMs for structured threat modelling (STRIDE on 5G security), finding that domain adaptation and model size do not guarantee reliable…
The paper proposes an unsupervised method using multiple statistical indicators to detect adversarial or compromised context documents in Retrieval Augmented Generation (RAG) systems, even without kno…
The paper introduces HIDBench, a new benchmark for evaluating LLMs' ability to perform host-based intrusion detection using complex, noisy system logs, finding that model performance degrades signific…
The paper introduces a novel guardrail orchestration layer that improves the compliance and efficiency of high-stakes multimodal document generation by scoring multiple generated candidates against we…
Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li +6 more
This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are…
The paper proposes an embarrassingly simple detector that monitors model extraction attacks by testing whether the aggregate distribution of incoming LLM queries deviates from the historical distribut…
This paper introduces a machine learning system that detects phishing emails by analyzing contextual features from the entire email body content, achieving 95.41% accuracy using Logistic Regression.
The paper introduces RealVuln, a benchmark that demonstrates a clear three-tier performance hierarchy for security scanners on real-world code, with specialized tools significantly outperforming gener…
Haodong Zhao, Tianyi Xu, Tianhang Zhao, Zhuosheng Zhang +1 more
GradSentry introduces a novel backdoor sample filtering method that uses the spectral entropy of individual sample gradients to detect poisoned data during LLM fine-tuning, proving effective even at h…
Ayush Garg, Sophia Hager, Jacob Montiel, Aditya Tiwari +4 more
RuleForge is an automated system that generates and validates detection rules for web vulnerabilities from structured CVE templates, significantly improving detection accuracy and reducing false posit…
The paper proposes an end-to-end LLM framework that automates SOC operations by integrating ensemble-based threat detection, syntax-constrained query generation, and evidence-grounded incident resolut…
Osama Zafar, Alexander Nemecek, Yiqian Zhang, Wenbiao Li +4 more
The paper introduces a Privacy Policy Enforcement (PPE) framework using dual one-class density estimators to detect contextual data leakage in Retrieval-Augmented Generation (RAG) systems, achieving h…
This paper develops and evaluates supervised machine learning models to detect malicious tool descriptions within the Model Context Protocol (MCP), achieving high detection rates in both binary and mu…
The paper systematically compares multimodal transformer and LLM approaches for document type classification, finding that specialized multimodal Transformers outperform LLM-based models, especially w…
The paper demonstrates that relying on strict regular-expression parsing for evaluating LLM-based security log classifiers introduces systematic errors, potentially causing a functional model to appea…
Aiman Al Masoud, Antony Anju, Marco Arazzi, Mert Cihangiroglu +5 more
This paper provides the first comprehensive Systematization of Knowledge (SoK) on the security aspects of LLM-as-a-Judge (LaaJ) systems, identifying key vulnerabilities and proposing a taxonomy for fu…
Xavier Cadet, Aditya Vikram Singh, Harsh Mamania, Edward Koh +5 more
The paper introduces a Retrieval-Augmented Generation (RAG) system that uses targeted query filtering and LLM semantic reasoning to accurately and cost-effectively analyze complex cybersecurity incide…