~ similar to 2605.30462· 17 results
The paper argues that the standard FID metric is unreliable because its performance depends significantly on the geometric structure and density of the reference dataset, not just the sample quality.
Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang +7 more
This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM trainin…
The paper challenges the conclusion that LLMs lack reasoning by demonstrating that reported performance drops on GSM-Symbolic are often statistically weak and partially attributable to dataset biases,…
Zelin Guan, Shengda Zhuo, Zeyan Li, Jinchun He +3 more
E-MIA introduces a novel, stealthy black-box membership inference attack that converts verifiable hard evidence within a candidate document into an objective, multi-part exam score to determine if the…
The paper introduces CERA, a novel contrastive retrieval framework that improves RAG factuality and interpretability by using subjectivity-based hard negative selection and an auxiliary attention alig…
Chunlei Li, Zixuan Zheng, Yilei Shi, Guanglu Dong +4 more
The paper proposes a Signed Entropy Integral (SEI) statistic to detect mislabeled images in training datasets by analyzing the temporal trend of prediction entropy, achieving state-of-the-art results…
The paper demonstrates that clinical vision-language models (VLMs) pose a significant privacy risk by allowing de-identified images to be re-linked to original reports, and proposes a targeted differe…
This paper introduces the Data-Model Compatibility (DMC) metric to quantify how suitable a dataset is for reasoning distillation, showing that optimizing data selection using DMC significantly improve…
Bo Wang, Jia Ni, Mengnan Zhao, Zhan Qin +1 more
This paper systematically investigates unlearnable examples (UEs) across diverse training paradigms, finding that existing UEs fail under pretraining-finetuning (PF) settings, and proposes Shallow Sem…
The paper introduces the DECK taxonomy, a novel framework that classifies LLM hallucinations not by their content error, but by their detectability signature based on inter-sample consistency and toke…
This paper investigates why self-harm prediction models struggle to generalize across different hospitals, finding that variations in local lexical expression and feature importance are the primary ca…
This paper comparatively analyzes two automatic label error detection methods, Confident Learning and Dataset Cartography, demonstrating that targeted data filtering significantly improves model perfo…
The paper introduces a structured benchmark (TGAD) showing that current text-guided anomaly detection models often overstate their language conditioning, as performance significantly degrades when the…
Yung-Yu Shih, Shang-Yu Su, Tzu-I Ho, Dongzhe Wang +1 more
The paper presents BEATS, a human-in-the-loop LLM framework for bootstrapping product attribute taxonomies from scratch.
Yaxin Luo, Jiacheng Cui, Xiaohan Zhao, Xinyi Shang +4 more
The paper introduces LLMSurgeon, a framework that estimates the domain-level data mixture of a Large Language Model (LLM) using only generated text, thereby providing a post-hoc method to audit the mo…
The paper formalizes the problem of representation identifiability in supervised learning, showing that a representation property is identifiable if and only if it is constant across all possible fact…
PHANTOM is a novel framework that generates highly convincing, context-aware honeytokens by incorporating deep organizational knowledge, significantly improving their believability and detection resis…