Similar to 2606.01031 · 9 results
The paper proposes a persona-based evaluation framework that replaces monolithic AI benchmarks with structured cognitive profiles to capture diverse human perspectives, while also identifying the chal…
Haitian Li, Yanghao Zhou, Heyan Huang, Liangji Chen +14 more
The paper introduces MTAVG-Bench 2.0, a new benchmark designed to diagnose high-level failure modes of cinematic expressiveness in multi-talker audio-video generation, showing that even advanced model…
Sicheng Yang, Shulan Ruan, Shiwei Wu, Yu Liu +3 more
PolySpeech-100 introduces a large multilingual benchmark covering 110 linguistic variants to rigorously test Speech-LLMs, demonstrating that open-source models struggle with low-resource languages…
This paper evaluates biases in multimodal speech recognition by testing how pairing different faces with the same audio affects transcription accuracy, finding significant quality-of-service drops acr…
JenBridge is a novel, adaptive framework that generates high-fidelity, long-form video soundtracks, significantly improving narrative coherence and naturalness across scene transitions.
This paper proposes a 3D CNN detector that leverages temporal artifacts to accurately identify high-quality deepfake videos, demonstrating robust detection even after social media re-encoding.
Xiaolin Liu, Yilun Zhu, Xiangyu Zhao, Xuehui Wang +8 more
The paper introduces Moment-Video, a new benchmark that diagnoses the ability of video MLLMs to understand brief, critical visual events, revealing that current models struggle significantly with temp…
Aniket Anand, Janvijay Singh, Zhewei Sun, Dilek Hakkani-Tür +1 more
The paper demonstrates that the AI-like style introduced by post-training alignment can be measured, localized, and causally removed using a novel ablation technique called PASTA.
The paper introduces MENTIS, a geometry-first framework that measures how preference alignment structurally changes the internal computations of language models, finding that these changes are selecti…