~ similar to 2605.21780v1· 20 results
The paper proposes a certifiably robust malware detection framework using randomized smoothing and feature ablation to guarantee detection accuracy against metamorphic evasion attacks.
The paper proposes Byz-Clip21-SGD2M, a novel algorithm that achieves high-probability convergence guarantees for Federated Learning by integrating robust aggregation, double momentum, and clipping, re…
Kaisheng Fan, Weizhe Zhang, Yishu Gao, Tegawendé F. Bissyandé +1 more
The paper introduces Tail-risk Intrinsic Geometric Smoothing (TIGS), a plug-and-play, inference-time defense that suppresses backdoor attacks on LLMs by structurally smoothing the attention mechanism…
This paper corrects the theoretical analysis of DP-SGD by identifying that common implementations, which use batch averaging, result in weaker privacy guarantees than previously reported.
This paper proposes a density-aware attack that constructs triggers by placing poisoned samples in low-density regions of the clean data distribution, achieving high attack success rates even after st…
The paper introduces Disrupt-and-Rectify Smoothing (DR-Smoothing), a novel two-stage defense mechanism that significantly improves LLM security against jailbreaking attacks by restoring disrupted inpu…
Hoang Tran, Jorge Ramirez, Jiayi Wang, Alberto Bocchinfuso +2 more
The paper proposes a novel exponential mechanism using quadratic approximations to fine-tune machine learning models on sensitive data while providing strong differential privacy guarantees.
The paper introduces 'mixture mechanisms,' a novel class of additive noise mechanisms that achieve approximate differential privacy by mixing multiple Gaussian distributions, resulting in lower noise…
The paper introduces 'mixture mechanisms,' a novel class of additive noise mechanisms that achieve differential privacy for real-valued queries, significantly reducing noise compared to the standard G…
The paper proposes a Quantitative Information Flow (QIF) framework to systematically and rigorously compare Local Differential Privacy (LDP) frequency estimation protocols, moving beyond simple $\vare…
The paper develops a general framework to exactly characterize the composition of mechanisms satisfying multiple differential privacy constraints, extending known results to arbitrary numbers of const…
The paper introduces a lightweight, sampling-based cryptographic protocol for verifiable AI inference that drastically reduces proving overhead from minutes to milliseconds by leveraging statistical p…
The paper introduces an optimal black-box auditing framework using Donsker-Varadhan estimators to estimate Rényi differential privacy (RDP) guarantees for machine learning algorithms.
The paper introduces Zero-Run privacy auditing, a post-hoc framework that allows for practical differential privacy evaluation of large, deployed models without requiring retraining or controlled data…
The paper introduces the PML envelope, a novel definition that provides a robust and operationally meaningful measure of information leakage about a secret, satisfying both post-processing robustness…
TADP-RME introduces a trust-adaptive differential privacy framework that enhances data system reliability by dynamically adjusting the privacy budget based on user trust and disrupting geometric struc…
This paper proposes two post-processing techniques, random selection and linear combination, to construct a model that satisfies any desired differential privacy level without retraining, given a set…
This paper analyzes the reliability of efficient membership inference attack (MIA) evaluation methods, demonstrating that standard aggregation techniques introduce biases that compromise accurate vuln…
The paper proposes CEAR, an ensemble-based method that combines empirical and certified defenses to achieve superior provable robustness against adversarial attacks in Deep Neural Networks.
The paper demonstrates that current defenses against malicious fine-tuning of foundation models are insufficient because they only address fixed attacks, and introduces a unified adaptive attack that…