Similar to 2606.05678v1 · 14 results
This paper proposes using random sampling of prediction precision during inference to significantly enhance the adversarial robustness of Automatic Speech Recognition (ASR) systems.
Yifan Liao, Yule Liu, Zhen Sun, Zongmin Zhang +4 more
The paper introduces MARS, a novel meta-adversarial framework that significantly improves black-box adversarial attacks against state-of-the-art Singing Voice Deepfake Detection (SVDD) systems by esca…
The paper introduces GRIDS, a framework using Local Intrinsic Dimensionality (LID) to detect anomalies in self-supervised speech model representations, showing that LID elevation correlates with ASR d…
This paper provides a unified taxonomy and controlled empirical evaluation of jailbreak attacks and defenses for Large Audio Language Models (LALMs), demonstrating that safety evaluation must consider…
The paper introduces DECKER, a domain-invariant framework that significantly improves cross-keyboard keystroke inference by normalizing device variations and leveraging linguistic context, demonstrati…
Meng Chen, Kun Wang, Li Lu, Jiaheng Zhang +1 more
The paper introduces AudioHijack, a framework that successfully demonstrates context-agnostic and imperceptible auditory prompt injection attacks, showing that commercial Large Audio-Language Models c…
Ahmed Sabbah, Mohammed Kharma, Radi Jarrar, Samer Zein +1 more
This study longitudinally evaluates the adversarial robustness of Android malware detection systems over a decade, finding that temporal separation significantly degrades robustness due to concept dri…
This survey provides a comprehensive taxonomy and vulnerability-centric analysis of adversarial attacks targeting Multimodal Large Language Models (MLLMs), offering an explanatory framework for enhanc…
The paper formally proves a theorem on adversarial noise amplification and proposes a lightweight detection mechanism that uses the amplified noise signal for robust adversarial defense.
MelShield is a robust, in-generation audio watermarking framework that embeds identifiable signals into AI-generated speech in the Mel-spectrogram domain for reliable copyright protection and attribut…
Kun Wang, Meng Chen, Junhao Wang, Yuli Wu +5 more
STEP introduces a novel, black-box, retraining-free detector that profiles audio samples using dual perturbation branches to detect backdoor attacks by exploiting the characteristic instability of hid…
Pengcheng Zhou, Pianran Guo, Shuhua Chen, Mengqin Zhao +2 more
The paper proposes Domain-Aware Sharpness Minimization (DASM), a novel optimizer that enhances the robustness and generalization of voice stream steganalysis models across varying data distributions.
Sicheng Yang, Shulan Ruan, Shiwei Wu, Yu Liu +3 more
PolySpeech-100 introduces a large-scale multilingual benchmark covering 110 linguistic variants to rigorously test Speech-LLMs, demonstrating that open-source models struggle with low-resource languages…
Yiwei Zhang, Jeremiah Birrell, Reza Ebrahimi, Rouzbeh Behnia +2 more
The paper proposes WARDEN, a distributionally robust adversarial training framework that significantly reduces LLM vulnerability to adversarial attacks by dynamically reweighting hard adversarial exam…