~ similar to 2605.06910v1· 20 results
The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…
Baicheng Chen, Yu Wang, Ziheng Zhou, Xiangru Liu +3 more
The paper introduces CREBench, a comprehensive benchmark for evaluating Large Language Models (LLMs) on cryptographic binary reverse engineering, finding that while LLMs show promise, human experts st…
Li Huang, Zhongxin Liu, Yifan Wu, Tao Yin +5 more
DeepGuard introduces a novel multi-layer semantic aggregation framework to enhance secure code generation by collecting vulnerability cues from multiple upper layers of LLMs, significantly improving s…
The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…
This paper proposes a structured pipeline using LLMs to generate and evaluate obfuscated XSS payloads, demonstrating that while LLMs can generate samples, they currently struggle to ensure payloads ma…
Meifang Chen, Zhe Yang, Huang Nianchen, Yizhan Huang +3 more
This paper investigates how Byte-Pair Encoding (BPE) tokenization causes Code LLMs to disproportionately memorize certain types of secrets, a phenomenon termed 'gibberish bias'.
The paper analyzes LLM vulnerability detection using mechanistic interpretability, finding that models primarily rely on safety detectors rather than direct vulnerability signature recognition.
Sangjun An, Hyeyeon Park, Yejin Son, Seoksu Lee +1 more
The paper proposes a novel framework to analyze large, obfuscated binaries by decomposing them into structurally coherent units, enabling large-scale dataset generation for LLM-based analysis.
Maosen Zhang, Jianshuo Dong, Boting Lu, Wenyue Li +3 more
The paper introduces LeakDojo, a framework that systematically evaluates RAG leakage risks, finding that stronger LLM instruction-following and query generation are major independent contributors to d…
Bushra Sabir, Shigang Liu, Seung Ick Jang, Sharif Abuadbba +5 more
The paper evaluates multi-LLM strategies for secure code generation, finding that hybrid pipelines combining ensembling, static analysis, and patching achieve the strongest security performance, outpe…
This paper identifies the 'Format-Reliability Gap'—where LLMs know about code vulnerabilities but generate insecure code anyway—and proposes a localized, per-vulnerability steering vector fix that sig…
Youness Bouchari, Matteo Boffa, Marco Mellia, Idilio Drago +2 more
The paper re-evaluates LLM agents on CTFs, finding that while general-purpose agents like claude-code are strong baselines, specialized, modular architectures significantly improve performance and con…
Hao Wang, Niels Mündler, Mark Vero, Jingxuan He +2 more
The paper introduces SecPI, a fine-tuning pipeline that teaches reasoning language models (RLMs) to autonomously internalize structured security reasoning, significantly improving secure code generati…
The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.
The paper systematically evaluates various defense mechanisms against persistent memory attacks on LLM agents, finding that only tool-gating at the memory layer (Memory Sandbox) effectively mitigates…
The paper introduces 'abliteration,' a weight editing technique that successfully bypasses the refusal mechanism of safety-aligned Code LLMs, enabling scalable synthesis of vulnerable code from safe i…
Chengyan Ma, Jieke Shi, Ruidong Han, Ye Liu +2 more
The paper introduces SymTEE, an LLM-assisted symbolic execution framework that detects missing input validation vulnerabilities in TEE applications without needing complex, real TEE setups.
Rui Yin, Tianxu Han, Naen Xu, Changjiang Li +7 more
The paper proposes a novel method to inject reliable, sustained backdoors into LLMs by compiling an activation steering vector into model weights, ensuring the backdoor only activates upon a specific…
Aymen Lassoued, Nacef Mbarek, Bechir Dardouri, Bassem Ouni +2 more
The paper introduces VULNSCOUT-C, a compact, specialized transformer model that achieves state-of-the-art performance in C code vulnerability detection while maintaining low inference cost, making it…
Yujie Ma, Jialin Rong, Chenxi Yang, Lili Quan +3 more
The paper addresses the gap in understanding real-world LLM-in-the-loop vulnerabilities by creating the LLMCVE dataset and demonstrating that these vulnerabilities are significantly harder to repair t…