ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2605.06910v1· 20 results

cs.CRcs.AIRecentApr 1, 2026

Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks

Anubhab Sahu, Diptisha Samanta, Reza Soosahabi

The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…

View →
cs.CRcs.AIcs.CLRecentApr 4, 2026

CREBench: Evaluating Large Language Models in Cryptographic Binary Reverse Engineering

Baicheng Chen, Yu Wang, Ziheng Zhou, Xiangru Liu +3 more

The paper introduces CREBench, a comprehensive benchmark for evaluating Large Language Models (LLMs) on cryptographic binary reverse engineering, finding that while LLMs show promise, human experts st…

View →
cs.SEcs.AIcs.CRRecentApr 10, 2026

DeepGuard: Secure Code Generation via Multi-Layer Semantic Aggregation

Li Huang, Zhongxin Liu, Yifan Wu, Tao Yin +5 more

DeepGuard introduces a novel multi-layer semantic aggregation framework to enhance secure code generation by collecting vulnerability cues from multiple upper layers of LLMs, significantly improving s…

View →
cs.CRcs.AIcs.LGRecentMay 22, 2026

An Empirical Evaluation of LLM-Generated Code Security Across Prompting Methods

Mohammed Kharma, Ahmed Sabbah, Mohammad Alkhanafseh, Mohammad Hammoudeh +1 more

The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…

View →
cs.CRcs.LGcs.SERecentApr 21, 2026

Evaluating LLM-Generated Obfuscated XSS Payloads for Machine Learning-Based Detection

Divyesh Gabbireddy, Suman Saha

This paper proposes a structured pipeline using LLMs to generate and evaluate obfuscated XSS payloads, demonstrating that while LLMs can generate samples, they currently struggle to ensure payloads ma…

View →
cs.CRcs.AIRecentApr 20, 2026

Understanding Secret Leakage Risks in Code LLMs: A Tokenization Perspective

Meifang Chen, Zhe Yang, Huang Nianchen, Yizhan Huang +3 more

This paper investigates how Byte-Pair Encoding (BPE) tokenization causes Code LLMs to disproportionately memorize certain types of secrets, a phenomenon termed 'gibberish bias'.

View →
cs.CRcs.LGRecentMay 28, 2026

Dissecting the Black Box: Circuit-Level Analysis of LLM Vulnerability Detection

Syafiq Al Atiiq, Chun Zhou, Christian Gehrmann

The paper analyzes LLM vulnerability detection using mechanistic interpretability, finding that models primarily rely on safety detectors rather than direct vulnerability signature recognition.

View →
cs.CRRecentMay 11, 2026

Towards LLM-Based Analysis of Virtualization-Obfuscated Code through Automated Data Generation

Sangjun An, Hyeyeon Park, Yejin Son, Seoksu Lee +1 more

The paper proposes a novel framework to analyze large, obfuscated binaries by decomposing them into structurally coherent units, enabling large-scale dataset generation for LLM-based analysis.

View →
cs.CRcs.AIcs.CLRecentMay 7, 2026

LeakDojo: Decoding the Leakage Threats of RAG Systems

Maosen Zhang, Jianshuo Dong, Boting Lu, Wenyue Li +3 more

The paper introduces LeakDojo, a framework that systematically evaluates RAG leakage risks, finding that stronger LLM instruction-following and query generation are major independent contributors to d…

View →
cs.CRcs.SERecentMar 24, 2026

Does Teaming-Up LLMs Improve Secure Code Generation? A Comprehensive Evaluation with Multi-LLMSecCodeEval

Bushra Sabir, Shigang Liu, Seung Ick Jang, Sharif Abuadbba +5 more

The paper evaluates multi-LLM strategies for secure code generation, finding that hybrid pipelines combining ensembling, static analysis, and patching achieve the strongest security performance, outpe…

View →
cs.CRcs.LGRecentApr 17, 2026

Surgical Repair of Insecure Code Generation in LLMs

Gustavo Sandoval, Brendan Dolan-Gavitt, Siddharth Garg

This paper identifies the 'Format-Reliability Gap'—where LLMs know about code vulnerabilities but generate insecure code anyway—and proposes a localized, per-vulnerability steering vector fix that sig…

View →
cs.CRcs.AIRecentApr 29, 2026

Autonomous LLM Agents & CTFs: A Second Look

Youness Bouchari, Matteo Boffa, Marco Mellia, Idilio Drago +2 more

The paper re-evaluates LLM agents on CTFs, finding that while general-purpose agents like claude-code are strong baselines, specialized, modular architectures significantly improve performance and con…

View →
cs.CRcs.AIRecentApr 4, 2026

SecPI: Secure Code Generation with Reasoning Models via Security Reasoning Internalization

Hao Wang, Niels Mündler, Mark Vero, Jingxuan He +2 more

The paper introduces SecPI, a fine-tuning pipeline that teaches reasoning language models (RLMs) to autonomously internalize structured security reasoning, significantly improving secure code generati…

View →
cs.CRcs.CLcs.LGRecentMay 22, 2026

What Does the Server See? Understanding Privacy Leakage from Large Language Models in Split Inference

Mingyuan Fan, Yu Liu, Fuyi Wang, Cen Chen

The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.

View →
cs.CRcs.AIcs.LGRecentMay 8, 2026

Defense effectiveness across architectural layers: a mechanistic evaluation of persistent memory attacks on stateful LLM agents

Jun Wen Leong

The paper systematically evaluates various defense mechanisms against persistent memory attacks on LLM agents, finding that only tool-gating at the memory layer (Memory Sandbox) effectively mitigates…

View →
cs.CRcs.AIcs.SERecentJun 3, 2026

Willing but Unable: Separating Refusal from Capability in Code LLMs via Abliteration

Cristina Carleo, Pietro Liguori, Naghmeh Ivaki, Domenico Cotroneo

The paper introduces 'abliteration,' a weight editing technique that successfully bypasses the refusal mechanism of safety-aligned Code LLMs, enabling scalable synthesis of vulnerable code from safe i…

View →
cs.SEcs.CRRecentMay 21, 2026

Finding Missing Input Validation in TEEs via LLM-Assisted Symbolic Execution

Chengyan Ma, Jieke Shi, Ruidong Han, Ye Liu +2 more

The paper introduces SymTEE, an LLM-assisted symbolic execution framework that detects missing input validation vulnerabilities in TEE applications without needing complex, real TEE setups.

View →
cs.CRcs.CLRecentApr 14, 2026

Compiling Activation Steering into Weights via Null-Space Constraints for Stealthy Backdoors

Rui Yin, Tianxu Han, Naen Xu, Changjiang Li +7 more

The paper proposes a novel method to inject reliable, sustained backdoors into LLMs by compiling an activation steering vector into model weights, ensuring the backdoor only activates upon a specific…

View →
cs.CRRecentMar 30, 2026

VulnScout-C: A Lightweight Transformer for C Code Vulnerability Detection

Aymen Lassoued, Nacef Mbarek, Bechir Dardouri, Bassem Ouni +2 more

The paper introduces VULNSCOUT-C, a compact, specialized transformer model that achieves state-of-the-art performance in C code vulnerability detection while maintaining low inference cost, making it…

View →
cs.SEcs.CRRecentMay 27, 2026

Towards Demystifying and Repairing LLM-in-the-Loop Vulnerabilities

Yujie Ma, Jialin Rong, Chenxi Yang, Lili Quan +3 more

The paper addresses the gap in understanding real-world LLM-in-the-loop vulnerabilities by creating the LLMCVE dataset and demonstrating that these vulnerabilities are significantly harder to repair t…

View →