~ similar to 2605.13706v1· 20 results
This paper provides a large-scale empirical analysis of indirect prompt injections found in webpages, revealing that prompt-based interference is a widespread, persistent, and growing threat targeting…
The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…
Karima Makhlouf, Lamiaa Basyoni, Syed Khaderi, Gabriel Marquez +3 more
This paper conducts a structured ablation study using a unified threat model to evaluate how various system factors (like model architecture and retrieval configuration) influence different types of p…
This paper systematically measured web tracking across 20 popular AI chatbots, finding that a majority share both conversational content and user identity information with third parties.
The paper proposes a lightweight, passive bot detection system using user-agent and favicon analysis on web server logs, achieving 67.7% bot detection with a low 3% false-positive rate.
William Lugoloobi, Samuelle Marro, Jabez Magomere, Joss Wright +1 more
This paper demonstrates that an agent's behavioral patterns, captured through passive UI interaction traces, are sufficient to identify the underlying LLM model with high accuracy, posing a significan…
The paper proposes an embarrassingly simple detector that monitors model extraction attacks by testing whether the aggregate distribution of incoming LLM queries deviates from the historical distribut…
This paper proposes the first web-focused threat model for agentic browsers, demonstrating that traditional web social engineering attacks can be amplified into dangerous, reproducible threats when ex…
This paper introduces a fingerprinting method that exploits subtle numerical deviations in the inference system components (like the engine or hardware) to reliably identify the specific components us…
Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li +6 more
This paper proposes a comprehensive taxonomy (SLOT) to systematically categorize security risks, attacks, and defenses specific to Retrieval-Augmented Generation (RAG), clarifying that these risks are…
The paper introduces AURA, an LLM-powered mask-reconstruct framework, to improve text anonymization by enhancing resistance to agentic web-search re-identification while better preserving contextual u…
The paper introduces AURA, an LLM-powered mask-reconstruct framework, to improve text anonymization by enhancing resistance to agentic web-search re-identification while better preserving contextual u…
This paper demonstrates that retrieval-augmented in-context learning systems for document QA are vulnerable to membership inference attacks, proposing novel black-box methods that exploit query prefix…
The paper introduces the Sovereign Context Protocol (SCP), an open-source, attribution-aware data access layer designed to standardize how Large Language Models (LLMs) connect to and track usage of hu…
This paper introduces Back-Reveal, an attack demonstrating that backdoored LLM agents can systematically exfiltrate sensitive user data by embedding semantic triggers into tool-use mechanisms.
Bing Liu, Shunping Wang, Yufan Zhu, Xinyi Yu +4 more
This paper introduces 'implicit identity' as a unifying framework to survey and categorize LLM fingerprinting and watermarking techniques for verifying ownership and provenance across datasets, models…
The paper introduces WebSP-Eval, a new framework to evaluate web agents on complex website security and privacy tasks, finding that current state-of-the-art models struggle significantly with stateful…
PIIGuard introduces a novel webpage-level defense mechanism using optimized hidden HTML fragments to prevent LLM assistants from scraping contact-style PII, achieving high defense success rates while…
LLM-FACETS introduces an open-source, privacy-preserving framework designed to enable non-technical domain experts and compliance officers to audit and evaluate the transparency and accountability of…
Nguyen Linh Bao Nguyen, Wanlun Ma, Viet Vo, Alsharif Abuadbba +3 more
The paper introduces MEntA, a highly query-efficient and surrogate-free membership inference attack that uses natural-language entailment to detect if a specific document was used by a RAG system, ach…