Similar to 2606.00563 · 20 results
Tim Nielen, Sameer Ambekar, Johannes Kiechle, Daniel M. Lang +1 more
This paper identifies prediction bias, a failure mode of entropy minimization in test-time adaptation, and proposes Distribution Shift Bias Reduction (DSBR) to stabilize adaptation and prevent model c…
Adaptive data selection significantly improves wearable prediction performance, particularly for individuals with poor baseline health metrics, suggesting that selective data sampling should be tailor…
This paper analyzes the poor performance of Meta-learning for Training-data Selection (MTS) and proposes that increasing the batch size and incorporating informative features can significantly improve…
The paper introduces the Causal Sensitivity Score (CSS), an interventional metric that reveals that standard coverage-based evaluations fail to detect critical responsiveness deficits in clinical LLMs…
The study systematically evaluated the utility loss of Cox regression under differential privacy (DP) using multiple datasets, finding that significant utility degradation occurs at standard DP levels…
The paper investigates predictive multiplicity and arbitrariness in recidivism risk assessment, finding that similarly accurate models often exhibit high predictive agreement, and proposes a simple po…
Giuliano Martinelli, Piriyakorn Piriyatamwong, Abelardo Carlos Martinez Lorenzo, Jasmin Baier +6 more
The paper introduces Query2Effect, a large-scale benchmark, and a two-step framework to predict causal effect sizes from natural language queries, showing that structured representation significantly…
The paper introduces a comprehensive benchmark to test whether physics foundation models learn generalizable dynamics, finding that their performance is highly conditional rather than universally general.
Junqi Liu, Salena Song, Yuhan Wang, Jiawei Mao +11 more
The paper introduces AutoMedBench, a novel workflow-aware benchmark that evaluates autonomous medical-AI agents across a five-stage research process, revealing that agents struggle most with validatio…
The paper proposes a Bayesian meta-learner to accurately predict the distribution of Alzheimer's disease progression scores for individuals, outperforming existing methods, especially for long-term pr…
This paper proposes using genetic programming (GP) to jointly evolve both the feature sets and the structure of survival trees, resulting in highly interpretable and high-performing shallow models for…
The paper proposes 'Think Fast, Talk Smart,' a pipeline that separates deterministic data analysis from LLM generation, showing that offloading recurring, structured tasks to code significantly improv…
The paper introduces an agentic, framework-based system to transform under-specified academic papers into standardized, comparable, and executable benchmarks for industrial Prognostics and Health Mana…
Li Zhang, Yuyuan Li, XiaoHua Feng, Jiaming Zhang +2 more
This paper addresses the challenge of achieving optimal fairness and accuracy simultaneously in multi-class classification by proposing novel in-processing and post-processing algorithms that converge…
The paper models healthcare mechanism design as program synthesis, demonstrating that an optimized, mixed-objective program can eliminate up-coding and reduce patient rejection while maintaining finan…
This paper diagnoses a bias-dominated shortcut in class-level machine unlearning, where forgetting is achieved by suppressing classification head biases, and proposes bias-aware mechanisms to mitigate…
This paper evaluates multiple LLMs (DeepSeek-R1, OpenBioLLM-Llama3, Qwen 3.5) for generating privacy-safe, high-quality synthetic mental health reports, demonstrating their effectiveness in expanding…
Qingchao Jiang, Zhenxuan Hou, Zhiying Zhu, Zhenxing Qian +2 more
The paper proposes EMSFD, an evidence-based decision modeling approach that enhances synthetic face detection reliability and generalizability by explicitly modeling class evidence and incorporating u…
The paper introduces Influence-Guided Symbolic Regression (IGSR), a novel framework that uses granular influence scores to guide LLMs in efficiently searching for and discovering complex mathematical…
The paper introduces COPF, an online framework that ensures deployment-stable counterfactual fairness in link recommendation systems operating on evolving graphs by monitoring and controlling group di…