Papers similar to 2605.30748

~ similar to 2605.30748· 11 results

cs.CLcs.AIRecentJun 1, 2026

SimSD: Simple Speculative Decoding in Diffusion Language Models

Junxia Cui, Haotian Ye, Runchu Tian, Hongcan Guo +8 more

The paper proposes SimSD, a plug-and-play speculative decoding algorithm that adapts diffusion language models (dLLMs) to achieve fast, token-level acceleration by restoring causal masking capabilitie…

View →

cs.LGcs.AIeess.ASRecentMay 31, 2026

MURMUR: An Efficient Inference System for Long-Form ASR

Wei-Tzu Lee, Keisuke Kamahori, Baris Kasikci

Murmur is an efficient inference system for long-form ASR that resolves the accuracy-latency trade-off by optimizing both inter-chunk processing and intra-chunk attention mechanisms.

View →

cs.LGcs.AIRecentJun 1, 2026

FLARE: Diffusion for Hybrid Language Model

Yuchen Zhu, Jing Shi, Chongjian Ge, Hao Tan +8 more

FLARE is a systematic conversion framework that enables a single checkpoint to support both autoregressive (AR) and diffusion-style parallel decoding for hybrid-attention large language models, achiev…

View →

cs.CLcs.AIRecentMay 31, 2026

DSL-LLaDA: Scaling Continuous Denoising to 8B Masked Diffusion LMs

Longxuan Yu, Yunshu Wu, Yu Fu, Siheng Xiong +4 more

The paper introduces DSL-LLaDA, a method that lightly adapts a pre-trained masked diffusion language model to perform continuous denoising in embedding space, significantly improving text generation q…

View →

cs.LGcs.AIRecentMay 28, 2026

BlockBatch: Multi-Scale Consensus Decoding for Efficient Diffusion Language Model Inference

Xiaoyou Wu, Cheng-Jhih Shih, Binfei Ji, Yong Liu +1 more

BlockBatch introduces a novel framework that efficiently accelerates diffusion language model (dLLM) inference by simultaneously executing multiple block-size branches for a single request, achieving…

View →

cs.CLRecentMay 29, 2026

Efficient Diffusion LLMs via Temporal-Spatial Parallel Decoding and Confidence Extrapolation

Zekai Li, Ji Liu, Yiqing Huang, Ziqiong Liu +2 more

The paper proposes a novel trace-aware decoding framework, combining Temporal-Spatial Parallel Decoding (TSPD) and Confidence Extrapolation (CE), to significantly accelerate the inference of diffusion…

View →

cs.CLcs.AIeess.ASRecentMay 31, 2026

PolySpeech-100: A Large-Scale Benchmark for Speech Understanding Across 100+ Languages and Dialects

Sicheng Yang, Shulan Ruan, Shiwei Wu, Yu Liu +3 more

PolySpeech-100 introduces a massive, multi-lingual benchmark covering 110 linguistic variants to rigorously test Speech-LLMs, demonstrating that open-source models struggle with low-resource languages…

View →

cs.AIcs.CVeess.ASRecentMay 27, 2026

Diffusion Large Language Models for Visual Speech Recognition

Jeong Hun Yeo, Chae Won Kim, Hyeongseop Rha, Yong Man Ro

The paper proposes DLLM-VSR, a novel Diffusion Large Language Model framework for Visual Speech Recognition, achieving state-of-the-art performance by treating transcription as iterative masked denois…

View →

cs.CLcs.AIRecentMay 28, 2026

DLM-SWAI: Steering Diffusion Language Models Before They Unmask

Hyeseon An, Yo-Sub Han

The paper introduces DLM-SWAI, a training-free method that effectively steers diffusion language models (DLMs) toward desired textual styles or properties by biasing the token distribution at each den…

View →

eess.AScs.AIRecentMay 29, 2026

OpenSTBench: Beyond Semantic Evaluation for Speech Translation

Yanjie An, Yuxiang Zhao, Yichi Zhang, Qixi Zheng +4 more

The paper introduces OpenSTBench, a unified, multidimensional evaluation framework designed to comprehensively compare heterogeneous speech translation systems by jointly assessing translation, speech…

View →

cs.AIRecentMay 30, 2026

TAPS: Target-Aware Prefix Tree Selection for Diffusion-Drafted Speculative Decoding

Zhuoyu Wang, Junnan Huang, Xinyu Chen

TAPS introduces a target-aware prefix selection method that optimizes the trade-off between draft tree acceptance and verification cost, achieving significant speedups in speculative decoding.

View →