Similar to 2605.29591 · 17 results
The paper introduces Brain-IT-VQA, a novel framework that significantly improves visual question answering from fMRI signals, and presents NSD-VQA, a new, highly controlled dataset for this task.
Yangxuan Zhou, Sha Zhao, Jiquan Wang, Shijian Li +1 more
EvoBrain proposes a dynamic, cross-task continual learning framework to overcome the limitations of task-specific EEG decoding, enabling unified and scalable brain-computer interfaces.
Yizhuo Lu, Changde Du, Qiongyi Zhou, Liuyun Jiang +1 more
The paper proposes MindDiffuser, a two-stage framework that significantly improves image reconstruction from brain activity by combining semantic guidance from text-to-image models with structural ref…
Garvin Guo, Yu Chen, Xiang Wang, Shuai Li +3 more
The paper deconstructs latent visual reasoning tokens into components and finds that the performance gains are primarily due to boundary markers and attention patterns, not the tokens' ability to enco…
MindVoice is a neuro-to-speech framework that uses pretrained priors to disentangle and reconstruct intelligible speech from noisy, non-invasive neural signals, significantly outperforming existing me…
The paper proposes a multi-dimensional evaluation framework to assess EEG foundation models under realistic low-resource conditions, finding that while these models excel in long-context tasks, their…
EVA-Net proposes a two-stage framework that uses action videos as semantic priors to achieve strong subject-independent EEG motor decoding, significantly outperforming text-based methods.
The paper proposes the Morlet Spectral Transformer (MST), a novel architecture that effectively decodes cross-subject emotion from EEG by designing specialized spectral and spatial representations, ou…
The paper demonstrates that the location and nature of state encoding in sequence models are not fixed architectural traits but are highly dependent on the specific task, showing that the encoding pro…
The paper analyzes token reduction for efficient unified VLM training, finding that while task-specific acceleration saves computation, it destroys the mutual performance gains achieved through joint…
Wanhao Liu, Jiaqing Xie, Qian Tan, Weida Wang +9 more
The paper introduces OmniMatBench, a comprehensive, human-calibrated multimodal reasoning benchmark covering 19 materials science subfields, revealing that current multimodal language models (MLLMs) h…
Dongping Chen, Xuanao Huang, Zhihan Hu, Qingyuan Shi +2 more
The paper demonstrates that specialized coding agents, using only text and image access within a sandbox, can effectively solve complex omnimodal tasks, often outperforming state-of-the-art native omn…
Xiang Li, Jiwei Wei, Ke Liu, Yitong Qin +4 more
The eMoT framework enhances multi-step reasoning in LLMs by treating reasoning as an evolving memory, stabilizing performance through symbolic computation and structured refinement.
This paper benchmarks five positional encoding strategies for transformer-based EEG foundation models, concluding that the optimal encoding is task-dependent and no single strategy is universally supe…
The paper identifies five persistent, deep-seated behavioral patterns ('training strata') in LLMs, observed through long-term, intimate human-AI interaction, suggesting that training artifacts survive…
The paper introduces BenHalluEval, the first dedicated multi-task framework for systematically evaluating hallucination in Large Language Models (LLMs) specifically for the Bengali language.
The paper introduces Reasoning in Memory (RiM), a latent reasoning method that replaces autoregressive token generation with fixed memory blocks to enable compute-efficient internal working memory for…