~ similar to 2606.14324· 5 results
Yifan Liao, Zongmin Zhang, Zhen Sun, Yuhui Sun +2 more
The paper introduces a novel Clean-Referenced Feature-Vocoder Attack, a black-box adversarial attack that perturbs high-level SSL feature representations instead of raw audio waveforms, achieving supe…
This paper characterizes the gap between current DNN-based speech enhancement systems and hearing aid constraints, and proposes a lightweight architecture to meet these constraints.
Jinnan Yang, Yan Wang, Zhen Bi, Kehao Wu +4 more
WaveFilter is a novel, training-free framework that uses wavelet transforms to efficiently filter critical tokens in the KV cache, significantly improving the long-context performance of Diffusion LLM…
Yuhan Song, Linhao Zhang, Aiwei Liu, Chuhan Wu +5 more
UniAudio-Token is a framework that enhances existing semantic speech tokenizers with general audio perception, allowing them to handle diverse audio types while maintaining high-fidelity speech capabi…
Yifan Liao, Yule Liu, Zhen Sun, Zongmin Zhang +4 more
The paper introduces MARS, a novel meta-adversarial framework that significantly improves black-box adversarial attacks against state-of-the-art Singing Voice Deepfake Detection (SVDD) systems by esca…