~ similar to 2605.26791v1· 20 results
This paper provides the first longitudinal analysis of log-based detection rule evolution in public repositories, finding that rule changes reflect ongoing operational trade-offs rather than steady co…
The paper introduces ABLE, an LLM-based system that automatically generates YARA rules to bypass malware evasion checks in analysis sandboxes, achieving a 79% bypass success rate.
The paper proposes a structural method using decision tree rulesets and multiple complementary metrics to detect concept drift in evolving malware families, finding that fixed-interval windowing with…
The paper demonstrates that static malware classifiers often rely on superficial artifacts like packing and metadata rather than true malicious semantics, using the TRUSTEE interpretability tool to di…
The paper introduces Trident, a novel malware detection system that combines static features, LLM-derived behavioral rules, and direct LLM analysis to achieve superior robustness against concept drift…
The paper introduces a validated, consensus-labeled prompt bank that separates requests for executable malicious code (weapons) from requests for general harmful security knowledge, providing a more g…
AsmRAG is a novel framework that improves malware detection by treating it as an evidence-based retrieval task using a code-specialized LLM, achieving high accuracy while providing transparent forensi…
The paper introduces a high-precision APT malware attribution method that uses ranked binary classifiers with explicit abstention, significantly improving accuracy when encountering unknown or out-of-…
This study re-evaluates LLM package hallucination rates on a new cohort of frontier models, finding a significant reduction in overall hallucination rates but identifying a persistent, model-agnostic…
Luca Minnei, Cristian Manca, Giorgio Piras, Angelo Sotgiu +5 more
The paper proposes a model-agnostic framework to evaluate combining Active Learning (AL) and Semi-Supervised Learning (SSL) techniques for malware detection, demonstrating that these combined methods…
By analyzing over 27,000 posts from 325 public ransomware leak sites, this paper demonstrates that ransomware groups exhibit non-random, predictable operational regularities concerning victim concentr…
The paper introduces a large, consensus-labeled prompt bank that reliably distinguishes between requests for executable malicious code and requests for harmful security knowledge, providing a standard…
The paper introduces the first byte-native Large Language Model (LLM) capable of analyzing raw executable binary data, achieving high accuracy in tasks like malware and architecture classification.
Vincent Koc, Patrick Erichsen, Jacob Tomlinson, Agustin Rivera +2 more
The paper analyzes a dataset of agent skills, demonstrating that different security scanners (VirusTotal, static analysis, SkillSpector) rarely agree, necessitating a layered governance approach for s…
Vincent Koc, Patrick Erichsen, Jacob Tomlinson, Agustin Rivera +2 more
The paper analyzes a dataset of agent skills, demonstrating that different security scanners (VirusTotal, static analysis, SkillSpector) rarely agree on maliciousness, necessitating layered security g…
Ayush Garg, Sophia Hager, Jacob Montiel, Aditya Tiwari +4 more
RuleForge is an automated system that generates and validates detection rules for web vulnerabilities from structured CVE templates, significantly improving detection accuracy and reducing false posit…
Steven Seiden, Triss Ren, Caroline Zhang, Taein Kim +2 more
The paper proposes a novel, scalable technique using unique canary tokens to automatically and accurately identify which web scrapers are feeding data to specific Large Language Models (LLMs).
Saastha Vasan, Yuzhou Nie, Kaie Chen, Yigitcan Kaya +5 more
MalwarePT introduces a novel binary-level foundation model, pretrained on Windows PE code-section bytes using a ModernBERT-style encoder, demonstrating superior transfer learning capabilities across v…
The paper introduces McNdroid, a large longitudinal multimodal benchmark for Android malware, demonstrating that temporal drift significantly degrades detection performance, which is best mitigated by…
Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim +1 more
The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.