~ similar to 2605.31520· 20 results
The paper introduces a hybrid CNN-CodeBERT framework for three-class credential leakage detection, significantly improving accuracy by explicitly distinguishing genuine secrets from non-secret placeho…
Zhihao Chen, Ying Zhang, Yi Liu, Gelei Deng +6 more
This study conducts a large-scale empirical analysis of third-party LLM agent skills, identifying that credential leakage is a pervasive, cross-modal issue primarily caused by debug logging and result…
Maosen Zhang, Jianshuo Dong, Boting Lu, Wenyue Li +3 more
The paper introduces LeakDojo, a framework that systematically evaluates RAG leakage risks, finding that stronger LLM instruction-following and query generation are major independent contributors to d…
Meifang Chen, Zhe Yang, Huang Nianchen, Yizhan Huang +3 more
This paper investigates how Byte-Pair Encoding (BPE) tokenization causes Code LLMs to disproportionately memorize certain types of secrets, a phenomenon termed 'gibberish bias'.
This paper proposes a lightweight, fast vulnerability detection pipeline for C/C++ code using simple token n-grams and basic code metrics, achieving a PR-AUC of 0.642 on random splits but showing limi…
This paper proposes a multi-layered defense strategy combining pre-output monitoring, calibrated canary detection, and cumulative information-flow tracking to prevent LLM agents from exfiltrating sens…
The paper introduces HackerSignal, a massive, multi-source benchmark dataset that uniquely links hacker community discourse to the entire CVE vulnerability lifecycle, enabling advanced temporal cyber…
The paper introduces a systematic benchmark to test LLMs' ability to recover Indicators of Compromise (IoCs) from JavaScript code, finding that while LLMs handle simple obfuscation well, encryption-ba…
The paper introduces an efficient, lightweight LLM framework for smart contract auditing that decouples the audit process into multiple components, achieving high accuracy while significantly reducing…
The paper introduces GuardPhish, a large-scale dataset and evaluation framework, demonstrating that even high-performing open-source LLMs can generate actionable phishing content despite accurate inte…
BackFlush introduces a novel, knowledge-free framework that detects and eliminates unknown backdoor attacks in LLMs while simultaneously preserving existing watermarks, achieving high detection rates…
The paper introduces 'abliteration,' a weight editing technique that successfully bypasses the refusal mechanism of safety-aligned Code LLMs, enabling scalable synthesis of vulnerable code from safe i…
The paper introduces codebadger, a Model Context Protocol (MCP) server that integrates Joern's Code Property Graph (CPG) with LLMs, enabling large language models to perform large-scale, semantic prog…
Yuqing Nie, Chong Wang, Guosheng Xu, Guoai Xu +3 more
MATRIX is a novel, robust code watermarking framework that encodes watermarks using constrained parity-check matrix equations, achieving high detection accuracy and improved robustness for code proven…
Maofei Chen, Laifu Wang, Yue Qin, Yuan Wang +2 more
The paper demonstrates that using raw source text for fine-tuning LLMs on vulnerability detection causes high false-positive rates by memorizing surface-level syntax, a problem mitigated by using Abst…
Aymen Lassoued, Nacef Mbarek, Bechir Dardouri, Bassem Ouni +2 more
The paper introduces VULNSCOUT-C, a compact, specialized transformer model that achieves state-of-the-art performance in C code vulnerability detection while maintaining low inference cost, making it…
Shenao Yan, Shimaa Ahmed, Shan Jin, Sunpreet S. Arora +3 more
The paper introduces CodeScan, a novel black-box framework that detects data poisoning in code generation LLMs by analyzing structural similarities across multiple generations to identify recurring, v…
Tingda Shen, Yebo Feng, Konglin Zhu, Xiaojun Jia +2 more
The paper introduces SIGIL, a novel framework that cryptographically seals the entire lifecycle of LLM skills, ensuring verifiable integrity from publication through runtime execution to prevent suppl…
Nils Loose, Joseph Bienhüls, Kristoffer Hempel, Felix Mächtle +1 more
The paper evaluates code language model-based detection of vulnerability-fixing commits (VFCs) using a unified benchmark and concludes that code changes alone are insufficient for accurate detection,…
This paper proposes an empirical methodology to automate web application trustworthiness assessment by leveraging Large Language Models (LLMs) to verify adherence to secure coding practices, showing t…