Similar to 2605.30963 · 19 results
AgentPLM introduces a novel framework that enhances protein language models by integrating external biophysical tools and a specialized policy optimization, enabling active, reasoning-based protein se…
Aravind Mandiga, Guoming Li, Jin Lu, Ismailcem Budak Arpinar +2 more
The paper introduces ProtStructQA, an executable benchmark that tests protein structural reasoning by requiring language models to generate measurable 3D coordinates, revealing a capability-dependent…
The authors introduce Structured PubMed, a comprehensive corpus of section-labeled biomedical abstracts compiled from the complete PubMed database.
Yeqi Huang, Yue Chen, Yanwei Ye, Guanhao Su +1 more
The paper introduces Ryze, an automated system that synthesizes evidence-enriched question-answering (QA) pairs from raw biomedical papers, resulting in a specialized VLM (BioVLM-8B) that significantl…
Keyue Qiu, Xintong Wang, Zhilong Zhang, Hao Zhou +1 more
The paper introduces GeoCoupling, a framework that systematically optimizes the temporal coupling between heterogeneous modalities to improve the co-design of biomolecules, outperforming fixed synchro…
This paper introduces BBOmix, an open-source benchmark for unsupervised representation learning on real-world biological data.
Kaihui Cheng, Zhiqiang Cai, Wenkai Xiang, Zhihang Hu +3 more
The paper adds a history-dependent bias to generative protein emulators, significantly improving the exploration of rare and diverse protein states compared to standard emulators.
Zhichen Tang, Zhengzheng Dang, Yulin Chen, Jixin Wu +2 more
EvoMD-LLM introduces a novel framework that models reactive molecular dynamics as a symbolic temporal language problem, enabling LLMs to accurately predict complex, time-evolving chemical processes.
Xinyu Yuan, Xixian Liu, Jianan Zhao, Yashi Zhang +2 more
The paper introduces CORE, a contrastive evidence organization method, which significantly improves the accuracy of LLM-based predictions of gene expression changes following cellular perturbations by…
MolLingo is a multi-agent system that significantly improves automated molecular design by integrating domain-specific chemical reasoning and structural context into LLMs, outperforming state-of-the-a…
The paper proposes EPIC, an efficient parallel decoding framework that significantly speeds up constraining diffusion language model outputs with context-free grammars (CFGs).
The paper introduces the Vector Network (VN), a novel recurrent architecture that replaces fixed weight matrices with reusable weight atoms, enabling superior compositional generalization by making st…
The paper introduces Semantic Triplet Restoration (STR), a novel protocol that converts complex table structures into atomic semantic triplets, improving table question answering by providing explicit…
This study benchmarks four local LLMs for natural-language-to-SQL querying in biopharma manufacturing, finding that general-purpose code-tuned models like Llama 3.1 8B and Qwen 2.5 Coder 7B outperform…
Astrid van den Brandt, Kiroong Choe, Sehi L'Yi, Devin Lange +1 more
The paper evaluates various LLM-based agentic schemes for authoring complex, interactive, multiview genomics visualizations, finding that agentic iteration significantly improves visualization quality…
FuzzPilot is a controller for AFL++ that validates candidate mutation recipes by running short micro-campaigns, demonstrating a mechanism to manage fuzzing plateaus, though initial results on a satura…
BIRDNet is a novel, sparse, and interpretable deep neural network that encodes Boolean implication knowledge mined directly from tabular data, achieving performance comparable to dense models while dr…
The paper introduces Influence-Guided Symbolic Regression (IGSR), a novel framework that uses granular influence scores to guide LLMs in efficiently searching for and discovering complex mathematical…