Papers similar to 2605.30462

17 results

cs.CV · cs.AI · Recent · May 28, 2026

Rethinking FID Through the Geometry of the Reference Dataset

The paper argues that the standard FID metric is unreliable because it depends significantly on the geometric structure and density of the reference dataset, not just on sample quality.

cs.AI · cs.CL · Recent · May 28, 2026

Demystifying Data Organization for Enhanced LLM Training

Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang +7 more

This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM trainin…

cs.AI · cs.CL · Recent · May 27, 2026

The Importance of Being Statistically Earnest: A Critical Re-evaluation of GSM-Symbolic

Dominika Agnieszka Długosz, Arlindo Oliveira, Natalia Díaz-Rodríguez

The paper challenges the conclusion that LLMs lack reasoning by demonstrating that reported performance drops on GSM-Symbolic are often statistically weak and partially attributable to dataset biases,…

cs.CR · cs.AI · Recent · May 1, 2026

E-MIA: Exam-Style Black-Box Membership Inference Attacks against RAG Systems

Zelin Guan, Shengda Zhuo, Zeyan Li, Jinchun He +3 more

E-MIA introduces a novel, stealthy black-box membership inference attack that converts verifiable hard evidence within a candidate document into an objective, multi-part exam score to determine if the…

cs.CL · Recent · May 31, 2026

Beyond Topical Similarity: Contrastive Evidence Retrieval with Interpretable Attention Alignment in RAG

Francielle Vargas, João Robiatti, Diego Alves, Lucas Pascotti Valem +5 more

The paper introduces CERA, a novel contrastive retrieval framework that improves RAG factuality and interpretability by using subjectivity-based hard negative selection and an auxiliary attention alig…

cs.CV · cs.AI · Recent · May 29, 2026

On Revisiting Entropy for Identifying Mislabeled Images

Chunlei Li, Zixuan Zheng, Yilei Shi, Guanglu Dong +4 more

The paper proposes a Signed Entropy Integral (SEI) statistic to detect mislabeled images in training datasets by analyzing the temporal trend of prediction entropy, achieving state-of-the-art results…

cs.CV · cs.AI · cs.CL · Recent · Jun 1, 2026

Cross-modal linkage risk in clinical vision-language models

Soroosh Tayebi Arasteh, Mahshad Lotfinia, Sven Nebelung, Daniel Truhn

The paper demonstrates that clinical vision-language models (VLMs) pose a significant privacy risk by allowing de-identified images to be re-linked to original reports, and proposes a targeted differe…

cs.AI · Recent · May 28, 2026

Tailoring the Curriculum: Student-Centered Reasoning Distillation via Dynamic Data-Model Compatibility

Jiahao Huang, Fei Cheng, Junfeng Jiang, Akiko Aizawa

This paper introduces the Data-Model Compatibility (DMC) metric to quantify how suitable a dataset is for reasoning distillation, showing that optimizing data selection using DMC significantly improve…

cs.LG · cs.AI · cs.CR · Recent · Apr 18, 2026

Channel-Level Semantic Perturbations: Unlearnable Examples for Diverse Training Paradigms

Bo Wang, Jia Ni, Mengnan Zhao, Zhan Qin +1 more

This paper systematically investigates unlearnable examples (UEs) across diverse training paradigms, finding that existing UEs fail under pretraining-finetuning (PF) settings, and proposes Shallow Sem…

cs.CL · Recent · Jun 1, 2026

DECK: A Consistency x Confidence Taxonomy of LLM Hallucinations

Mohit Singh Chauhan

The paper introduces the DECK taxonomy, a novel framework that classifies LLM hallucinations not by their content error, but by their detectability signature based on inter-sample consistency and toke…

cs.CL · Recent · Jun 1, 2026

Why Do Self-Harm Prediction Models Struggle to Generalise? Lexical and Semantic Variations in Emergency Department Triage Notes

Liuliu Chen, Mike Conway, Jo Robinson, Vlada Rozova

This paper investigates why self-harm prediction models struggle to generalize across different hospitals, finding that variations in local lexical expression and feature importance are the primary ca…

cs.CL · cs.AI · cs.LG · Recent · May 28, 2026

Data filtering methods for training language models

Egor Shevchenko, Elena Bruches

This paper comparatively analyzes two automatic label error detection methods, Confident Learning and Dataset Cartography, demonstrating that targeted data filtering significantly improves model perfo…

cs.CV · cs.AI · cs.LG · Recent · Jun 1, 2026

A Structured Benchmark for Text-Guided Anomaly Detection: When Language Stops Conditioning the Decision

Stefano Samele, Eugenio Lomurno, Teodora Jovanovic, Sanjay Shivakumar Manohar +2 more

The paper introduces a structured benchmark (TGAD) showing that current text-guided anomaly detection models often overstate their language conditioning, as performance significantly degrades when the…

cs.IR · cs.CL · Recent · Jun 3, 2026

BEATS: Bootstrapping E-commerce Attribute Taxonomies for Search through Iterative Human-AI Collaboration

Yung-Yu Shih, Shang-Yu Su, Tzu-I Ho, Dongzhe Wang +1 more

The paper presents BEATS, a human-in-the-loop LLM framework for bootstrapping product attribute taxonomies from scratch.

cs.CL · cs.AI · cs.LG · Recent · May 28, 2026

LLMSurgeon: Diagnosing Data Mixture of Large Language Models

Yaxin Luo, Jiacheng Cui, Xiaohan Zhao, Xinyi Shang +4 more

The paper introduces LLMSurgeon, a framework that estimates the domain-level data mixture of a Large Language Model (LLM) using only generated text, thereby providing a post-hoc method to audit the mo…

cs.LG · cs.AI · Recent · May 31, 2026

A Fiber Criterion for Representation Identifiability in Supervised Learning

Vasileios Sevetlidis

The paper formalizes the problem of representation identifiability in supervised learning, showing that a representation property is identifiable if and only if it is constant across all possible fact…

cs.CR · Recent · May 4, 2026

PHANTOM: Polymorphic Honeytoken Adaptation with Narrative-Tailored Organisational Mimicry

Abraham Itzhak Weinberg

PHANTOM is a novel framework that generates highly convincing, context-aware honeytokens by incorporating deep organizational knowledge, significantly improving their believability and detection resis…
