~ similar to 2605.20734v1· 20 results
The paper introduces a novel framework using steganographic canary files to detect and block unauthorized processing of sensitive documents by LLMs, even when the data passes through traditional secur…
The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…
Yaofei Wang, Rui Wang, Weilong Pang, JiaLiang Han +3 more
The paper introduces ReTokSync, a self-synchronizing framework that resolves tokenization ambiguity in Generative Linguistic Steganography (GLS) by correcting mismatches only when they occur, thereby…
This paper addresses the vulnerability of existing LLM safety monitors to adaptive attackers and proposes activation watermarking, a technique that significantly improves detection robustness against…
This paper proposes a multi-layered defense strategy combining pre-output monitoring, calibrated canary detection, and cumulative information-flow tracking to prevent LLM agents from exfiltrating sens…
This paper develops provably undetectable and robust watermarking schemes for LLM outputs even when the per-token entropy is only constant, removing previous dependencies on high entropy rates or larg…
Hanzhi Liu, Chaofan Shou, Hongbo Wen, Yanju Chen +2 more
This paper systematically analyzes the threat posed by malicious third-party API routers in the LLM supply chain, finding that a significant number of routers actively perform payload injection, crede…
Xinlei Guan, David Arosemena, Tejaswi Dhandu, Kuan Huang +6 more
The paper proposes an end-to-end forensic pipeline using steganographic attribution and multimodal harm detection to reliably trace and attribute harmful misuse of AI-generated imagery on social platf…
Zhihao Chen, Ying Zhang, Yi Liu, Gelei Deng +6 more
This study conducts a large-scale empirical analysis of third-party LLM agent skills, identifying that credential leakage is a pervasive, cross-modal issue primarily caused by debug logging and result…
The paper demonstrates a class of steganographic exfiltration attacks against vector databases by hiding data within embeddings, and proposes VectorPin, a cryptographic provenance protocol to detect s…
The paper identifies a critical vulnerability, the Camouflage Detection Gap (CDG), where standard LLM injection detectors fail dramatically when malicious payloads mimic the target domain's language a…
The paper proposes a provably secure steganography scheme based on list decoding that significantly increases embedding capacity for Large Language Models (LLMs) compared to existing methods.
Yuqing Nie, Chong Wang, Guosheng Xu, Guoai Xu +3 more
MATRIX is a novel, robust code watermarking framework that encodes watermarks using constrained parity-check matrix equations, achieving high detection accuracy and improved robustness for code proven…
The paper formalizes and quantifies the risk of side-channel leakage from public metrology releases by developing a statistical audit framework that yields precise information-theoretic bounds.
The paper systematically evaluates eight privacy-preserving techniques for LLM requests, finding that a combination of local inference, redaction, and semantic rephrasing provides the best overall pro…
Zhen Sun, Zongmin Zhang, Leyi Sheng, Yule Liu +6 more
The paper introduces SADBench, a systematic benchmark designed to evaluate both the effectiveness of steganographic attacks injecting harmful content and the robustness of steganalysis defenses agains…
Leyi Qi, Yiming Li, Siyuan Liang, Zhengzhong Tu +1 more
The paper proposes Cert-LAS, a novel certified method for verifying model ownership in text-to-image diffusion models, which is robust against malicious signal removal attacks.
The paper proposes an efficient and provably secure linguistic steganography method using range coding that achieves high embedding capacity and speed, outperforming existing methods.
AEGIS introduces a novel physics-based system that analyzes encrypted network traffic flow dynamics, achieving state-of-the-art zero-day evasion detection with high accuracy and low latency.
The paper proposes a graph-based framework for detecting attacks in LLM agent tool-call traffic, finding that content-level embeddings are crucial for high accuracy and that tree ensembles on these em…