Papers similar to 2603.19375v1

~ similar to 2603.19375v1· 20 results

cs.CRcs.CVRecentApr 1, 2026

AutoMIA: Improved Baselines for Membership Inference Attack via Agentic Self-Exploration

Ruhao Liu, Weiqi Huang, Qi Li, Xinchao Wang

AutoMIA introduces an agentic framework that automates the process of Membership Inference Attacks (MIAs) by self-exploring the attack space, achieving state-of-the-art performance without manual feat…

View →

cs.CRcs.LGRecentMay 28, 2026

Fingerprinting Inference Systems of Large Language Models

Anna Wimbauer, Jonas Möller, Erik Imgrund, Konrad Rieck

This paper introduces a fingerprinting method that exploits subtle numerical deviations in the inference system components (like the engine or hardware) to reliably identify the specific components us…

View →

cs.CRRecentMay 8, 2026

Membership Inference Attacks on Vision-Language-Action Models

Yuefeng Peng, Mingzhe Li, Kejing Xia, Renhao Zhang +1 more

This paper presents the first systematic study of membership inference attacks (MIAs) against Vision-Language-Action (VLA) models, demonstrating that these models are highly vulnerable to privacy brea…

View →

cs.CRcs.AIRecentMay 4, 2026

On the Privacy of LLMs: An Ablation Study

Karima Makhlouf, Lamiaa Basyoni, Syed Khaderi, Gabriel Marquez +3 more

This paper conducts a structured ablation study using a unified threat model to evaluate how various system factors (like model architecture and retrieval configuration) influence different types of p…

View →

cs.CRcs.LGRecentMar 24, 2026

A Critical Review on the Effectiveness and Privacy Threats of Membership Inference Attacks

Najeeb Jebreel, David Sánchez, Josep Domingo-Ferrer

The paper proposes a new evaluation framework showing that, under realistic conditions, Membership Inference Attacks (MIAs) are weak privacy threats, suggesting that relying on them as a primary priva…

View →

cs.CRcs.LGRecentMay 27, 2026

MRMMIA: Membership Inference Attacks on Memory in Chat Agents

Kai Chen, Yan Pang, Tianhao Wang

The paper proposes Multi-Recall Memory MIA (MRMMIA), a unified attack framework to test for privacy leakage by determining if a candidate memory unit belongs to a chat agent's private memory store.

View →

cs.LGcs.CRRecentMay 25, 2026

On Reliability of Efficient Membership Inference Vulnerability Evaluation

Joonas Jälkö, Gauri Pradhan, Ossi Räisä, Antti Honkela

This paper analyzes the reliability of efficient membership inference attack (MIA) evaluation methods, demonstrating that standard aggregation techniques introduce biases that compromise accurate vuln…

View →

cs.CRcs.AIRecentApr 1, 2026

Automated Framework to Evaluate and Harden LLM System Instructions against Encoding Attacks

Anubhab Sahu, Diptisha Samanta, Reza Soosahabi

The paper introduces an automated framework demonstrating that LLM system instructions are vulnerable to encoding attacks, where structured output requests can bypass safety refusals and leak sensitiv…

View →

cs.CRcs.ARcs.LGRecentMay 11, 2026

LLMs for Secure Hardware Design and Related Problems: Opportunities and Challenges

Johann Knechtel, Ozgur Sinanoglu, Ramesh Karri

This review analyzes the dual impact of integrating Large Language Models (LLMs) into hardware design, detailing both their transformative potential in EDA and the critical security vulnerabilities th…

View →

cs.CRRecentMay 28, 2026

Protecting On-Device AI Inference: A Systematic Review of Attacks and Defence Mechanisms

Zisis Tsiatsikas, Alexandros Fakis, Georgios Karopoulos, Vasileios Kouliaridis +1 more

This paper provides the first comprehensive review of threats and defenses specifically targeting on-device AI inference, revealing a significant imbalance where certain attack types, like adversarial…

View →

cs.CRRecentApr 21, 2026

A Data-Free Membership Inference Attack on Federated Learning in Hardware Assurance

Gijung Lee, Wavid Bowman, Olivia P. Dizon-Paradis, Reiner N. Dizon-Paradis +3 more

This paper presents a novel data-free Membership Inference Attack (MIA) that uses gradient inversion on Standard Cell Library Layouts (SCLLs) to reconstruct sensitive hardware images from intercepted…

View →

cs.LGcs.AIcs.CRRecentMar 25, 2026

Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs

Alexander Panfilov, Peter Romov, Igor Shilov, Yves-Alexandre de Montjoye +2 more

The paper demonstrates that using advanced AI agents in an autoresearch loop can discover novel and highly effective adversarial attack algorithms, significantly advancing the state-of-the-art for jai…

View →

cs.CRcs.AIRecentJun 3, 2026

From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents

Pritam Dash, Tongyu Ge, Aditi Jain, Tanmay Shah +1 more

This paper systematically studies memory poisoning attacks in LLM agents, identifying multiple vulnerabilities and proposing a new benchmark to assess the risk.

View →

cs.CRcs.CLcs.LGRecentMay 22, 2026

What Does the Server See? Understanding Privacy Leakage from Large Language Models in Split Inference

Mingyuan Fan, Yu Liu, Fuyi Wang, Cen Chen

The paper introduces ActInv and PAF to systematically analyze and quantify privacy leakage from intermediate activations during split inference of LLMs, proposing PriPert for enhanced defense.

View →

cs.LGcs.CRRecentMar 30, 2026

ReproMIA: A Comprehensive Analysis of Model Reprogramming for Proactive Membership Inference Attacks

Chihan Huang, Huaijin Wang, Shuai Wang

The paper introduces ReproMIA, a novel and efficient framework that uses model reprogramming to proactively amplify and detect latent privacy leakage for Membership Inference Attacks (MIAs), significa…

View →

cs.CRcs.AIcs.LGRecentMay 11, 2026

Continuous Discovery of Vulnerabilities in LLM Serving Systems with Fuzzing

Yunze Zhao, Yibo Zhao, Yuchen Zhang, Zaoxing Liu +1 more

The paper introduces GRIEF, a greybox fuzzer that discovers critical, concurrency-related vulnerabilities in LLM serving systems by treating timed multi-request traces as inputs, finding issues like c…

View →

cs.CRcs.AIcs.LGRecentMay 28, 2026

Honeyval: A Comprehensive Evaluation Framework for LLM-powered HTTP Honeypots

Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more

The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these honeypots provide substantially longer and harder-to-detect…

View →

cs.CRcs.AIcs.LGRecentMay 28, 2026

Honeyval: A Comprehensive Evaluation Framework for LLM-powered HTTP Honeypots

Mark Vero, Fabian Kaczmarczyck, Ivan Petrov, Ilia Shumailov +5 more

The paper introduces Honeyval, a comprehensive evaluation framework, to rigorously test LLM-powered HTTP honeypots, demonstrating that these systems provide substantially longer and harder-to-detect i…

View →

cs.CRcs.AIcs.CLRecentMay 5, 2026

Exposing LLM Safety Gaps Through Mathematical Encoding:New Attacks and Systematic Analysis

Haoyu Zhang, Mohammad Zandsalimy, Shanu Sushmita

The paper demonstrates that encoding harmful prompts as genuine mathematical problems, rather than just using mathematical formatting, effectively bypasses the safety filters of large language models.

View →

cs.CRcs.AIcs.LGRecentMay 8, 2026

Defense effectiveness across architectural layers: a mechanistic evaluation of persistent memory attacks on stateful LLM agents

Jun Wen Leong

The paper systematically evaluates various defense mechanisms against persistent memory attacks on LLM agents, finding that only tool-gating at the memory layer (Memory Sandbox) effectively mitigates…

View →