~ similar to 2605.15249v1· 20 results
This paper provides the first comprehensive review of threats and defenses specifically targeting on-device AI inference, revealing a significant imbalance where certain attack types, like adversarial…
The paper introduces Landseer, a modular framework designed to systematically evaluate and compose multiple machine learning defenses to address complex, real-world security requirements.
The paper demonstrates that simpler, shallower Deep Neural Network architectures with reduced features and ReLU activations can inherently improve the robustness of ML-NIDS against gradient-based adve…
This paper proposes using adversarial training to proactively harden the k-means clustering classifier, thereby mitigating evasion attacks that threaten resource provisioning stability in fog computin…
The paper demonstrates that current defenses against malicious fine-tuning of foundation models are insufficient because they only address fixed attacks, and introduces a unified adaptive attack that…
FlowGuard introduces an identity-independent defense using flow matching to detect data-free model stealing attacks by identifying synthetic queries as out-of-distribution based on their lower-dimensi…
The paper introduces ParDef, a generalized defense mechanism that effectively mitigates various types of parameter attacks on deep neural networks while maintaining high performance.
The paper establishes a standardized security assessment framework and develops a multi-layered defensive system, demonstrating that systematic testing and external defenses are crucial for safe LLM d…
Ahmed Sabbah, Mohammed Kharma, Radi Jarrar, Samer Zein +1 more
This study longitudinally evaluates the adversarial robustness of Android malware detection systems over a decade, finding that temporal separation significantly degrades robustness due to concept dri…
AESOP introduces an adversarial attack that targets the entire execution path of deep learning pipelines, demonstrating that path-aware selection can inflate computational costs by orders of magnitude…
The paper proposes a unified closed-loop threat taxonomy to systematically analyze and defend foundation models by explicitly framing the bidirectional security interactions between data and models.
This paper demonstrates that current AI model extraction defenses, which assume attacks come from single sources, are easily bypassed by coordinated, distributed threat actors.
Xiaozhe Zhang, Chaozhuo Li, Hui Liu, Shaocheng Yan +3 more
The EvoSafety framework enhances LLM safety by externalizing attack and defense mechanisms, enabling persistent, transferable, and model-agnostic robustness against adversarial prompts.
Yunrui Yu, Xuxiang Feng, Pengda Qin, Pengyang Wang +4 more
The paper introduces Dummy-Aware Weighted Attack (DAWA), a novel evaluation method that significantly reduces the reported robustness of Dummy Classes-based defenses by simultaneously targeting both t…
The Cognitive Firewall is a hybrid edge-cloud defense architecture that significantly reduces the attack success rate of Indirect Prompt Injection against browser-based AI agents by combining local vi…
This paper demonstrates that neural operators used in digital twins for nuclear systems are highly vulnerable to undetectable, sparse adversarial perturbations, necessitating new robustness guarantees…
Islam Debicha, Tayeb Kenaza, Ishak Charfi, Salah Mosbah +2 more
This paper evaluates a novel black-box adversarial attack to demonstrate the vulnerability of ML-based IoT Intrusion Detection Systems (IDS) and proposes a robust defense mechanism to mitigate these e…
The paper proposes a general-purpose pipeline to train automated red teaming models capable of generating attacks for arbitrary adversarial goals, overcoming the limitations of current methods that ar…
The paper identifies a critical vulnerability, the Camouflage Detection Gap (CDG), where standard LLM injection detectors fail dramatically when malicious payloads mimic the target domain's language a…
Jiaqing Li, Zhibo Zhang, Shide Zhou, Yuxi Li +2 more
The paper introduces TrojanMerge, a framework demonstrating that model merging can be exploited to systematically compromise the safety alignment of multiple individually safe LLMs.