Papers similar to 2606.04221

4 results

cs.SD · cs.AI · eess.AS · Recent · Jun 1, 2026

Echo: A Joint-Embedding Predictive Architecture for Speaker Diarization and Speech Recognition in a Shared Latent Space

Echo is a joint-embedding predictive architecture that uses a single, pretrained ViT encoder to simultaneously perform speaker diarization, speech recognition, and dynamic source separation in a share…

cs.SD · cs.AI · cs.MM · Recent · May 27, 2026

EigeNet: Geometry-Informed Multi-Modal Learning for Few-shot Novel View RIR Prediction

Chong Jing, Zitong Lan, Junan Zhang, Zhizheng Wu

EigeNet introduces a geometry-informed multi-modal Transformer framework to achieve state-of-the-art few-shot novel view Room Impulse Response (RIR) prediction by effectively integrating spatial geome…

cs.CL · cs.AI · Recent · May 31, 2026

DSL-LLaDA: Scaling Continuous Denoising to 8B Masked Diffusion LMs

Longxuan Yu, Yunshu Wu, Yu Fu, Siheng Xiong +4 more

The paper introduces DSL-LLaDA, a method that lightly adapts a pre-trained masked diffusion language model to perform continuous denoising in embedding space, significantly improving text generation q…

cs.CR · cs.LG · Recent · Jun 2, 2026

Long-Term and Short-Term Transistor Aging in Deep Neural Networks: Impact and Mitigation

Alireza Sarmadi, Virinchi Roy Surabhi, Prashanth Krishnamurthy, Hussam Amrouch +2 more

This paper analyzes the impact of long-term and short-term transistor aging on Deep Neural Network (DNN) inference accuracy and proposes an aging-aware retraining methodology to maintain performance e…
