Papers similar to 2605.26791v1 · 20 results

cs.CR · cs.SE · Recent · May 6, 2026

Evolution of Log-Based Detection Rules in Public Repositories

This paper provides the first longitudinal analysis of log-based detection rule evolution in public repositories, finding that rule changes reflect ongoing operational trade-offs rather than steady co…

cs.CR · Recent · May 20, 2026

A Large Language Model Approach to Generating Bypass Rules for Malware Evasion in Analysis Sandbox

Zhiyong Sui, Lamine Noureddine, Mst Eshita Khatun, Sideeq Bello +2 more

The paper introduces ABLE, an LLM-based system that automatically generates YARA rules to bypass malware evasion checks in analysis sandboxes, achieving a 79% bypass success rate.

cs.CR · cs.LG · Recent · Apr 24, 2026

Detecting Concept Drift in Evolving Malware Families Using Rule-Based Classifier Representations

Tomáš Kalný, Martin Jureček, Mark Stamp

The paper proposes a structural method using decision tree rulesets and multiple complementary metrics to detect concept drift in evolving malware families, finding that fixed-interval windowing with…

cs.CR · cs.LG · Recent · May 7, 2026

Beyond the Wrapper: Identifying Artifact Reliance in Static Malware Classifiers using TRUSTEE

Riyazuddin Mohammed, Lan Zhang

The paper demonstrates that static malware classifiers often rely on superficial artifacts like packing and metadata rather than true malicious semantics, using the TRUSTEE interpretability tool to di…

cs.CR · cs.LG · Recent · Apr 30, 2026

Trident: Improving Malware Detection with LLMs and Behavioral Features

Rebecca Saul, Jingzhi Jiang, Elliott Chia, David Wagner

The paper introduces Trident, a novel malware detection system that combines static features, LLM-derived behavioral rules, and direct LLM analysis to achieve superior robustness against concept drift…

cs.CR · cs.SE · Recent · May 4, 2026

A Validated Prompt Bank for Malicious Code Generation: Separating Executable Weapons from Security Knowledge in 1,554 Consensus-Labeled Prompts

Richard J. Young, Gregory D. Moody

The paper introduces a validated, consensus-labeled prompt bank that separates requests for executable malicious code (weapons) from requests for general harmful security knowledge, providing a more g…

cs.CR · Recent · Apr 25, 2026

AsmRAG: LLM-Driven Malware Detection by Retrieving Functionally Similar Assembly Code

ElMouatez Billah Karbab

AsmRAG is a novel framework that improves malware detection by treating it as an evidence-based retrieval task using a code-specialized LLM, achieving high accuracy while providing transparent forensi…

cs.CR · cs.AI · cs.LG · Recent · Jun 2, 2026

High-Precision APT Malware Attribution with Out-of-Scope Resilience

Peter Williams, Adam Sobey, Erisa Karafili

The paper introduces a high-precision APT malware attribution method that uses ranked binary classifiers with explicit abstention, significantly improving accuracy when encountering unknown or out-of-…

cs.CR · cs.LG · cs.SE · Recent · May 16, 2026

The Range Shrinks, the Threat Remains: Re-evaluating LLM Package Hallucinations on the 2026 Frontier-Model Cohort

Aleksandr Churilov

This study re-evaluates LLM package hallucination rates on a new cohort of frontier models, finding a significant reduction in overall hallucination rates but identifying a persistent, model-agnostic…

cs.LG · cs.CR · Recent · Mar 30, 2026

Label-efficient Training Updates for Malware Detection over Time

Luca Minnei, Cristian Manca, Giorgio Piras, Angelo Sotgiu +5 more

The paper proposes a model-agnostic framework to evaluate combining Active Learning (AL) and Semi-Supervised Learning (SSL) techniques for malware detection, demonstrating that these combined methods…

cs.CR · Recent · May 23, 2026

Analyzing Concentration, Temporal Routines and Targeting in Public Ransomware Leak Site Data

Lea Müller, York Yannikos

By analyzing over 27,000 posts from 325 public ransomware leak sites, this paper demonstrates that ransomware groups exhibit non-random, predictable operational regularities concerning victim concentr…

cs.CR · cs.CL · cs.LG · Recent · May 27, 2026

Code as a Weapon: A Consensus-Labeled Prompt Bank for Measuring Coding-Model Compliance with Malicious-Code Requests

Richard J. Young, Gregory D. Moody

The paper introduces a large, consensus-labeled prompt bank that reliably distinguishes between requests for executable malicious code and requests for harmful security knowledge, providing a standard…

cs.CR · cs.AI · Recent · Jun 1, 2026

Large Byte Model: Teaching Language Models About Compiled Code

Florian Störtz, Catalin-Andrei Stan, Alexandru Dinu, Sandra Servia-Rodríguez +3 more

The paper introduces the first byte-native Large Language Model (LLM) capable of analyzing raw executable binary data, achieving high accuracy in tasks like malware and architecture classification.

cs.CR · cs.AI · cs.SE · Recent · May 31, 2026

ClawHub Security Signals: When VirusTotal, Static Analysis, and SkillSpector Disagree

Vincent Koc, Patrick Erichsen, Jacob Tomlinson, Agustin Rivera +2 more

The paper analyzes a dataset of agent skills, demonstrating that different security scanners (VirusTotal, static analysis, SkillSpector) rarely agree, necessitating a layered governance approach for s…

cs.CR · cs.AI · cs.CL · Recent · Apr 2, 2026

RuleForge: Automated Generation and Validation for Web Vulnerability Detection at Scale

Ayush Garg, Sophia Hager, Jacob Montiel, Aditya Tiwari +4 more

RuleForge is an automated system that generates and validates detection rules for web vulnerabilities from structured CVE templates, significantly improving detection accuracy and reducing false posit…

cs.CR · cs.AI · cs.CY · Recent · May 13, 2026

Identifying AI Web Scrapers Using Canary Tokens

Steven Seiden, Triss Ren, Caroline Zhang, Taein Kim +2 more

The paper proposes a novel, scalable technique using unique canary tokens to automatically and accurately identify which web scrapers are feeding data to specific Large Language Models (LLMs).

cs.CR · Recent · May 15, 2026

MalwarePT: A Binary-Level Foundation Model for Malware Analysis

Saastha Vasan, Yuzhou Nie, Kaie Chen, Yigitcan Kaya +5 more

MalwarePT introduces a novel binary-level foundation model, pretrained on Windows PE code-section bytes using a ModernBERT-style encoder, demonstrating superior transfer learning capabilities across v…

cs.CR · cs.LG · Recent · May 7, 2026

McNdroid: A Longitudinal Multimodal Benchmark for Robust Drift Detection in Android Malware

Md Mahmuduzzaman Kamol, Jesus Lopez, Saeefa Rubaiyet Nowmi, Emilia Rivas +4 more

The paper introduces McNdroid, a large longitudinal multimodal benchmark for Android malware, demonstrating that temporal drift significantly degrades detection performance, which is best mitigated by…

cs.CR · cs.AI · Recent · May 8, 2026

CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios

Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim +1 more

The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.
