Similar to 2606.11616 · 20 results
Yongsik Seo, Wooseok Jeong, Eunyoung Kim, Hyeonseo Jang +1 more
The paper introduces CITETRACE, a large-scale dataset and evaluation framework that systematically measures structural citation failures in search-augmented LLMs, revealing a pattern called Verified M…
This paper systematically studies how soft errors propagate during Large Language Model (LLM) inference using a novel fault-injection framework, providing critical insights and mitigation strategies f…
The paper introduces an Item Response Theory (IRT)-based indicator that effectively identifies likely mislabeled items in existing LLM benchmarks, revealing systematic errors in labeling and model spe…
The paper introduces Deep Spurious Regression (DSR) to address spurious correlations in continuous prediction tasks, proposing a method that exploits attribute similarity in both feature and label spa…
This paper comparatively analyzes two automatic label error detection methods, Confident Learning and Dataset Cartography, demonstrating that targeted data filtering significantly improves model perfo…
Shuning Zhang, Eve He, Xiao Zhan, Shijing He +3 more
This paper investigates how Generative AI enables scalable, hyper-realistic fraud in Chinese e-commerce by fabricating product defect evidence, proposing new defense mechanisms like verifiable materia…
Nizar Islah, Istabrak Abbes, Irina Rish, Sarath Chandar +1 more
This paper proposes a method to recover recoverability structure from failed traces of post-trained language models, enabling test-time routing and post-training analysis.
Yaxin Luo, Jiacheng Cui, Xiaohan Zhao, Xinyi Shang +4 more
The paper introduces LLMSurgeon, a framework that estimates the domain-level data mixture of a Large Language Model (LLM) using only generated text, thereby providing a post-hoc method to audit the mo…
Zizhen Deng, Jiaru Zhang, Rui Ding, Huang Bojun +4 more
The paper proposes Test-Time Training for Supervised Causal Learning (TTT-SCL), a novel framework that dynamically generates training data aligned with specific test instances to significantly improve…
Xinle Deng, Ruobin Zhong, Hujin Peng, Xiaoben Lu +14 more
The paper introduces MemTrace, a framework that treats LLM memory pipelines as traceable graphs to systematically diagnose and automatically correct memory-related errors, boosting performance by up t…
Junbo Zhang, Qianli Zhou, Xinyang Deng, Wen Jiang +2 more
DataShield proposes an efficient method to identify safety-degrading samples within benign datasets, preventing the degradation of LLM safety capabilities during fine-tuning.
Junbo Zhang, Qianli Zhou, Xinyang Deng, Wen Jiang +2 more
DataShield proposes an efficient method to identify safety-degrading samples within benign datasets, quantifying each sample's contribution to an LLM's compliance behavior.
Mikhail L. Arbuzov, Lee Mosbacker, Sisong Bei, Ziwei Dong +2 more
The paper reframes LLM reliability from an impossible universal problem to a manageable, local patch-based problem, showing that sufficient interventions can be found by focusing on recurring failure…
FedDetox introduces a robust framework that sanitizes toxic data on edge devices during federated learning to maintain the safety alignment of Small Language Models (SLMs) without sacrificing utility.
TabChange proposes a novel framework to generate natural and minimally altered counterfactual instances in tabular data by precisely controlling attribute modifications based on their relationship str…
This paper proposes a method to improve error prediction for LLMs by explicitly disentangling input ambiguity from standard Uncertainty Quantification signals, showing that ambiguity information signi…
This systematic mapping survey reviews label-efficient approaches for code vulnerability detection, synthesizing five paradigm families and providing a decision guide to navigate trade-offs.
Md Nakhla Rafi, Md Ahasanuzzaman, Dong Jae Kim, Zhijie Wang +1 more
FALAT is a diagnostic framework that treats failure attribution in complex LLM agent trajectories as a dependency-guided search problem, successfully identifying both the responsible agent and the dec…
CB-SLICE is a novel concept-based method for discovering model error slices that leverages Concept Bottleneck Models (CBMs) to provide fine-grained, faithful explanations directly linked to the root c…
Pin Qian, Su Wang, Xiaoyuan Wang, Yihang Chen +6 more
The paper introduces FORCEBENCH, a new stress test designed to evaluate whether cited sources genuinely warrant the strength of a claim, revealing that standard citation evaluation methods often fail…