~ similar to 2605.30179· 18 results
Shuheng Cao, Ruiqi Chen, Renjie Cao, Zhenhao Zhang +2 more
The paper introduces BioConCal, a supervised scoring mechanism that evaluates biomedical NER candidates surfaced by multiple LLMs, significantly improving the quality of the candidate pool for human c…
This paper evaluates the causal reasoning abilities of large language models and finds that they rely heavily on lexical pattern matching rather than structural reasoning.
The paper proposes GraD-IBD, a graph-based model that reformulates longitudinal ICD diagnosis codes into temporally directed graphs to efficiently and accurately detect the risk of Inflammatory Bowel…
The paper introduces a robust, four-mechanism LLM pipeline that generates auditable, evidence-grounded structured trait records for hundreds of thousands of diverse species across multiple taxa.
The paper proposes a zero-shot multi-label topic classification framework and finds that while knowledge graph augmentation improves performance for smaller language models, it offers diminishing retu…
The paper proposes SPHERE, a novel framework that uses large language models to create semantic user personas, enabling effective cross-domain recommendation knowledge transfer between completely disj…
mcp-proto-okn is a Python server that facilitates natural language access to complex scientific knowledge graphs, simplifying cross-domain knowledge analysis for biomedical research.
Han Liu, Shanghao Shi, Yevgeniy Vorobeychik, Chongjie Zhang +1 more
This paper demonstrates that adversarial perturbations possess a low-rank structure, and proposes a two-step method to leverage this property to significantly improve the efficiency and effectiveness…
Jiwon Kim, Maya Ajit, Sherry Gong, Soorya Ram Shimgekar +3 more
The paper introduces LLUMI, an open-source framework that improves LLM writing assistance for mental health support using community feedback, demonstrating comparable performance to proprietary models…
BIRDNet is a novel, sparse, and interpretable deep neural network that encodes Boolean implication knowledge mined directly from tabular data, achieving performance comparable to dense models while dr…
The paper introduces Influence-Guided Symbolic Regression (IGSR), a novel framework that uses granular influence scores to guide LLMs in efficiently searching for and discovering complex mathematical…
CSULoRA is a post-hoc method that corrects trained LoRA adapters by estimating a safety-aligned subspace and solving a penalized minimum-change problem to attenuate unsafe update directions while pres…
MOOSE-Copilot is a novel web-based framework that unifies scientific hypothesis discovery by formalizing human-AI interaction, significantly improving performance over autonomous LLM baselines.
The paper introduces NaRA, a noise-aware LoRA technique that dynamically adapts fine-tuning parameters based on the noise level during diffusion, significantly improving the performance of Diffusion L…
Frontier LLM-based agents can effectively overcome the manual bottleneck of phenotype annotation by achieving consistency comparable to human experts, significantly outperforming existing NLP tools.
HypothesisMed introduces an inference-time pipeline for biomedical question answering that improves model reliability and structured output generation by fusing multiple model outputs and diagnosing t…
Keyue Qiu, Yixin Wu, Lihao Wang, Yawen Ouyang +18 more
The paper introduces AMix-2, a novel protein-text foundation model that unifies protein understanding and sequence design by embedding both modalities in a shared token space, achieving state-of-the-a…
The paper introduces a novel, scalable framework to monitor and classify dataset usage within research literature, addressing the current lack of infrastructure for tracking data citations.