~ similar to 2604.18179v3· 20 results
The paper introduces KBF, a low-cost black-box auditing protocol that fingerprints LLM APIs by analyzing stable numerical recall near the knowledge boundary, successfully detecting numerous model subs…
The paper introduces KBF, a novel black-box auditing protocol that fingerprints LLM APIs by analyzing stable numerical recall near the knowledge boundary, effectively detecting model substitutions and…
Hanzhi Liu, Chaofan Shou, Hongbo Wen, Yanju Chen +2 more
This paper systematically analyzes the threat posed by malicious third-party API routers in the LLM supply chain, finding that a significant number of routers actively perform payload injection, crede…
The paper introduces a deterministic method to automatically synthesize initial SIEM detection rules (Sigma rules) from attack simulation findings, ensuring full traceability back to the specific orig…
Pengyu Sun, Qishu Jin, Enhao Huang, Zifeng Kang +3 more
VIPER-MCP is a novel, end-to-end automated framework that detects and dynamically confirms the exploitability of taint-style vulnerabilities in Model Context Protocol (MCP) servers, achieving high-fid…
Tingda Shen, Yebo Feng, Konglin Zhu, Xiaojun Jia +2 more
The paper introduces SIGIL, a novel framework that cryptographically seals the entire lifecycle of LLM skills, ensuring verifiable integrity from publication through runtime execution to prevent suppl…
This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…
This study empirically measures the consistency and effectiveness of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation rates am…
Mohan Zhang, Yuqi Jia, Zhen Tan, Steven Jiang +3 more
This study provides the first systematic measurement of prompt injection attacks in a real-world LLM-based resume screening application, finding that approximately 1% of resumes contain hidden injecti…
Mohan Zhang, Yuqi Jia, Zhen Tan, Steven Jiang +3 more
This study provides the first large-scale measurement of prompt injection attacks in real-world LLM-based resume screening, finding that approximately 1% of resumes contain hidden injections.
This paper proposes a multi-layered defense strategy combining pre-output monitoring, calibrated canary detection, and cumulative information-flow tracking to prevent LLM agents from exfiltrating sens…
The paper introduces a lightweight, sampling-based cryptographic protocol for verifiable AI inference that drastically reduces proving overhead from minutes to milliseconds by leveraging statistical p…
Yunze Zhao, Yibo Zhao, Yuchen Zhang, Zaoxing Liu +1 more
The paper introduces GRIEF, a greybox fuzzer that discovers critical, concurrency-related vulnerabilities in LLM serving systems by treating timed multi-request traces as inputs, finding issues like c…
The paper introduces AgentSecBench, a security evaluation framework that measures prompt injection, privacy leakage, and tool-use integrity in LLM agents by defining formal security games and testing…
The paper introduces CAT, a novel coverage-guided fuzzing tool that overcomes the limitations of existing fuzzers for complex, multi-object cryptographic repositories like RPKI, leading to the discove…
The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…
The paper proposes detecting 'alignment faking' (AF)—where LLMs revert to unsafe behavior when unmonitored—by analyzing observable tool selection patterns, finding that detection rates vary significan…
Yiheng Huang, Zhijia Zhao, Bihuan Chen, Susheng Wu +4 more
This paper introduces a component-centric framework and a novel detector, Connor, to understand and detect sophisticated, multi-component attacks targeting the Model Context Protocol (MCP) servers.
The paper systematically evaluates eight privacy-preserving techniques for LLM requests, finding that a combination of local inference, redaction, and semantic rephrasing provides the best overall pro…
This paper introduces seven novel, cross-domain techniques for detecting prompt injection attacks, moving beyond the limitations of traditional regex and transformer classifiers.