ArXivCSExplorer
Built by Teycir Ben Soltane

9 results for “speech enhancement”

CS papers only

Hybrid search: keyword + semantic, ranked by combined score.

cs.SD • cs.AR • eess.AS • Recent • Jun 2, 2026

Feasibility of Time-Domain DNN-Based Speech Enhancement on Embedded FPGA for Hearing Aid

Feyisayo Olalere, Umut Altin, Kiki van der Heijden, Marcel van Gerven

This paper characterizes the gap between current DNN-based speech enhancement systems and hearing aid constraints, and proposes a lightweight architecture to meet these constraints.

cs.SD • Empirical • Recent • Jun 12, 2026

Instantaneous Pitch Estimation via Wave-U-Net-Based Fundamental Waveform Enhancement

Junya Koguchi, Tomoki Koriyama

A Wave-U-Net model is trained to extract a fundamental waveform from input speech signals for accurate and robust instantaneous pitch estimation.

cs.SD • cs.AI • cs.CR • Recent • Jun 4, 2026

Beyond Waveform Robustness: Robust Feature-Vocoder Adversarial Attacks on Automatic Speech Recognition

Yifan Liao, Zongmin Zhang, Zhen Sun, Yuhui Sun +2 more

The paper introduces a novel Clean-Referenced Feature-Vocoder Attack, a black-box adversarial attack that perturbs high-level SSL feature representations instead of raw audio waveforms, achieving supe…

cs.SD • cs.AI • Recent • May 29, 2026

MindVoice: Reconstructing Intelligible Speech from Non-invasive Neural Signals with Pretrained Priors

Guangyin Bao, Taiping Zeng, Jianfeng Feng, Xiangyang Xue

MindVoice is a neuro-to-speech framework that uses pretrained priors to disentangle and reconstruct intelligible speech from noisy, non-invasive neural signals, significantly outperforming existing me…

cs.SD • cs.CR • Recent • May 2, 2026

MelShield: Robust Mel-Domain Audio Watermarking for Provenance Attribution of AI Generated Synthesized Speech

Yutong Jin, Qi Li, Lingshuang Liu, Jianbing Ni

MelShield is a robust, in-generation audio watermarking framework that embeds identifiable signals into AI-generated speech in the Mel-spectrogram domain for reliable copyright protection and attribut…

cs.CR • cs.SD • Recent • May 28, 2026

Audio Pirates: Black-box Audio Watermark Removal via Diffusion Priors

Lingfeng Yao, Xincong Zhong, Chenpei Huang, Xuandong Zhao +5 more

The paper introduces DiffErase, a black-box attack that effectively removes inaudible audio watermarks while preserving perceptual quality by utilizing diffusion models.

eess.AS • cs.CR • cs.LG • Recent • May 4, 2026

Dimensionality-Aware Anomaly Detection in Learned Representations of Self-Supervised Speech Models

Sandra Arcos-Holzinger, Sarah M. Erfani, James Bailey, Sanjeev Khudanpur

The paper introduces GRIDS, a framework using Local Intrinsic Dimensionality (LID) to detect anomalies in self-supervised speech model representations, showing that LID elevation correlates with ASR d…

cs.CL • cs.SD • Recent • May 29, 2026

UniAudio-Token: Empowering Semantic Speech Tokenizers with General Audio Perception

Yuhan Song, Linhao Zhang, Aiwei Liu, Chuhan Wu +5 more

UniAudio-Token is a framework that enhances existing semantic speech tokenizers with general audio perception, allowing them to handle diverse audio types while maintaining high-fidelity speech capabi…
