Papers similar to 2606.00570

20 results

cs.CL, cs.AI · Recent · May 28, 2026

Towards Localized and Disentangled Knowledge Editing for Multimodal Large Language Models

Leijiang Gu, Zhen Zeng, Feng Li, Xinjian Gao +1 more

The paper proposes Localized and Disentangled Knowledge Editing (LDKE), a framework that significantly improves knowledge editing in Multimodal Large Language Models by ensuring edits are both precise…

cs.AI · Recent · May 31, 2026

AnyEdit++: Adaptive Long-Form Knowledge Editing via Bayesian Surprise

Bowen Tian, Caixue He, Jiemin Wu, Jingying Wang +3 more

AnyEdit++ introduces a structure-aware framework that uses Bayesian Surprise to adaptively segment long-form knowledge, significantly improving the coherence and accuracy of knowledge editing in LLMs.

cs.AI · Recent · Jun 1, 2026

Revisiting Ripple Effects in Knowledge Editing through Pressure-Aware Joint Neighborhood Optimization

Haoben Huang, Shuxin Liu, Ou Wu, Di Gao

The paper proposes Joint Neighborhood Optimization (JNO), a novel knowledge-editing framework that jointly addresses the coupled pressures of desirable knowledge propagation and unintended knowledge l…

cs.AI, cs.CR · Recent · May 11, 2026

Benchmarking Safety Risks of Knowledge-Intensive Reasoning under Malicious Knowledge Editing

Qinghua Mao, Xi Lin, Jinze Gu, Jun Wu +2 more

The paper introduces EditRisk-Bench, a novel benchmark designed to systematically evaluate the safety risks and downstream reasoning corruption caused by malicious knowledge editing in large language…

cs.AI, cs.SE · Recent · May 28, 2026

ParaTool: Shifting Tool Representations from Context to Parameters

Zekai Yu, Qi Meng, Qizhi Chu, Yu Hao +2 more

ParaTool introduces a novel framework that shifts tool representations from bulky context documentation to dedicated, loadable parameters, enabling efficient and robust tool calling in LLMs.

cs.CL, cs.CV · Recent · May 30, 2026

Do Text Edits Generalize to Visual Generation? Benchmarking Cross-Modal Knowledge Editing in UMMs

Xin Gao, Cheng Yang, Chufan Shi, Taylor Berg-Kirkpatrick

The paper introduces UniKE, a benchmark showing that knowledge edits that succeed in the text modality of UMMs do not reliably transfer to image generation, revealing a significant modality gap.

cs.AI, cs.LG · Recent · May 27, 2026

Zipping the Thought: When and How Compressed Reasoning Data Works in LLM Post-Training

Kohsei Matsutani, Gouki Minegishi, Takeshi Kojima, Yusuke Iwasawa +1 more

This paper investigates how different types of compressed reasoning data (Explicit, Composed, Implicit CoT) affect LLM performance during post-training, finding that the choice of compression and subs…

cs.CL, cs.AI, cs.LG · Recent · May 30, 2026

On the Limits of LLM Adaptability: Impact of Model-Internalized Priors on Annotation Task Performance

Etienne Casanova, Rafal Kocielnik, R. Michael Alvarez

The paper demonstrates that LLM performance in zero-shot annotation is significantly limited by the alignment between the model's internal understanding and the task definition, showing that prompt-ba…

cs.CL · Recent · May 29, 2026

Parameter Alignment Mitigates Catastrophic Forgetting in Multilingual Expert Language Models

Sanchit Ahuja, Terra Blevins

The paper introduces and evaluates five parameter alignment strategies that significantly mitigate catastrophic forgetting when continually pretraining multilingual expert language models across multi…

cs.AI · Recent · May 27, 2026

From Fact Overwriting to Knowledge Evolution: Causal Editing via On-Policy Self-Distillation

Shuaike Li, Kai Zhang, Xianquan Wang, Jiachen Liu +1 more

The paper introduces Causal Editing (CODE), a new paradigm that improves knowledge updates in LLMs by grounding fact injection in causal narratives, drastically reducing self-refutation rates.

cs.CL · Recent · May 31, 2026

On the Generalization Gap in Self-Evolving Language Model Reasoning

Zhenting Qi, Susanna Maria Baby, Stefanie Anna Baby, Kan Yuan +4 more

The paper investigates the limits of self-evolution in LLM reasoning under closed-loop settings, finding that while self-improvement is significant, it consistently falls short of perfect oracle super…

cs.AI, cs.CR · Recent · May 18, 2026

Safety Geometry Collapse in Multimodal LLMs and Adaptive Drift Correction

Jiahe Guo, Xiangran Guo, Jiaxuan Chen, Weixiang Zhao +5 more

This paper introduces the concept of Safety Geometry Collapse, demonstrating that multimodal inputs degrade the safety separation of LLMs, and proposes ReGap, a training-free method that adaptively co…

cs.AI, cs.CL · Recent · May 28, 2026

Demystifying Data Organization for Enhanced LLM Training

Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang +7 more

This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM trainin…

cs.AI · Recent · May 27, 2026

Do LLMs Build World Models From Text? A Multilingual Diagnostic of Spatial Reasoning

Zhikai Pan, Chih-Ting Liao, Chunrui Liu, Xi Xiao +4 more

The paper introduces a multilingual benchmark (MentalMap) to test if LLMs build internal spatial world models from text, finding a universal 'L3 reasoning cliff' suggesting that text-only working memo…

cs.AI, cs.CL · Recent · May 27, 2026

The Importance of Being Statistically Earnest: A Critical Re-evaluation of GSM-Symbolic

Dominika Agnieszka Długosz, Arlindo Oliveira, Natalia Díaz-Rodríguez

The paper challenges the conclusion that LLMs lack reasoning by demonstrating that reported performance drops on GSM-Symbolic are often statistically weak and partially attributable to dataset biases,…

cs.CL, cs.AI · Recent · May 29, 2026

Language Models Learn Constructional Semantics, Not To Mention Syntax: Investigating LM Understanding of Paired-Focus Constructions

Wesley Scivetti, Ethan Wilcox, Nathan Schneider, Kanishka Misra +1 more

The paper investigates whether modestly sized open-source language models can grasp the semantics of rare Paired-Focus constructions, finding that understanding emerges later in training and correlate…

cs.AI · Recent · May 27, 2026

HRBench: Benchmarking and Understanding Thinking-Mode Switch Strategies in Hybrid-Reasoning LLMs

Yansong Ning, Mianpeng Liu, Jingwen Ye, Weidong Zhang +1 more

The paper introduces HRBench, a unified and comprehensive evaluation framework for systematically benchmarking and comparing various thinking-mode switching strategies in hybrid-reasoning LLMs.

cs.CL · Recent · May 28, 2026

When English Rewrites Local Knowledge: Global Narrative Dominance in Large Language Models

Md Arid Hasan, Ruwad Naswan, Farhan Samir, Sharifa Sultana +1 more

The paper demonstrates that using English prompts causes large language models to prioritize globally dominant narratives over local cultural knowledge, even when local evidence is provided.

cs.CL · Recent · May 29, 2026

dMoE: dLLMs with Learnable Block Experts

Sicheng Feng, Zigeng Chen, Gongfan Fang, Xinyin Ma +1 more

dMoE proposes a block-level Mixture-of-Experts (MoE) framework for Diffusion Large Language Models (dLLMs) that aggregates token-level expert distributions into a unified block-level distribution, sig…

cs.LG, cs.CR · Recent · May 17, 2026

DP-SelFT: Differentially Private Selective Fine-Tuning for Large Language Models

Haichao Sha, Zihao Wang, Yuncheng Wu, Hong Chen +1 more

The paper proposes DP-SelFT, a novel framework for differentially private selective fine-tuning that significantly improves the privacy-utility trade-off for LLMs by intelligently selecting robust par…
