Similar to 2606.02386 · 20 results
Keyue Qiu, Yixin Wu, Lihao Wang, Yawen Ouyang +18 more
The paper introduces AMix-2, a novel protein-text foundation model that unifies protein understanding and sequence design by embedding both modalities in a shared token space, achieving state-of-the-a…
MolLingo is a multi-agent system that significantly improves automated molecular design by integrating domain-specific chemical reasoning and structural context into LLMs, outperforming state-of-the-a…
Zhichen Tang, Zhengzheng Dang, Yulin Chen, Jixin Wu +2 more
EvoMD-LLM introduces a novel framework that models reactive molecular dynamics as a symbolic temporal language problem, enabling LLMs to accurately predict complex, time-evolving chemical processes.
Shangheng Du, Xiangchao Yan, Jinxin Shi, Zongsheng Cao +10 more
MLEvolve is a novel self-evolving multi-agent framework that enables LLM agents to discover and optimize machine learning algorithms for complex, long-horizon tasks.
The paper introduces PROBE, an optimization framework that guides LLM agents in structure-based drug design by performing controlled 'probe edits' to assess how molecular changes affect both binding a…
Aravind Mandiga, Guoming Li, Jin Lu, Ismailcem Budak Arpinar +2 more
The paper introduces ProtStructQA, an executable benchmark that tests protein structural reasoning by requiring language models to generate measurable 3D coordinates, revealing a capability-dependent…
AutoScientists introduces a decentralized, self-organizing team of AI agents that significantly improves long-running scientific experimentation by enabling parallel exploration and knowledge sharing.
Aditya Kumar, Zhihan Lei, Jerry Yan, Joshua W. Momo +5 more
The paper proposes a modular agent framework and novel learning methods to design and optimize practical, cost-effective, and controllable LLM-based agentic systems.
This study benchmarks token-optimized formats (TOON and TRON) against JSON in end-to-end agentic AI systems, finding that TRON significantly reduces token overhead with minimal performance degradation…
Yeqi Huang, Yue Chen, Yanwei Ye, Guanhao Su +1 more
The paper introduces Ryze, an automated system that synthesizes evidence-enriched Question-Answering (QA) pairs from raw biomedical papers, resulting in a specialized VLM (BioVLM-8B) that significantl…
The paper introduces RAG-Pref, a novel, training-free Retrieval Augmented Generation (RAG) method for preference alignment that significantly improves LLM refusal guardrails against agentic attacks wi…
Astrid van den Brandt, Kiroong Choe, Sehi L'Yi, Devin Lange +1 more
The paper evaluates various LLM-based agentic schemes for authoring complex, interactive, multiview genomics visualizations, finding that agentic iteration significantly improves visualization quality…
This study benchmarks four local LLMs for natural-language-to-SQL querying in biopharma manufacturing, finding that general-purpose code-tuned models like Llama 3.1 8B and Qwen 2.5 Coder 7B outperform…
The paper introduces Chunk-Level Guided Generation, a training-free method that uses an off-the-shelf large language model (LLM) as a process scorer to guide small model generation, achieving performa…
Yujie Luo, Xiangyuan Ru, Jingsheng Zheng, Jingjing Wang +9 more
The paper introduces Autonomous Agentic Data Engineering, demonstrating that LLMs can autonomously plan and optimize end-to-end data curation pipelines, leading to substantial performance gains in spe…
The paper introduces Sovereign Agentic Loops (SAL), a control-plane architecture that decouples LLM reasoning from system execution to enhance safety and reliability in real-world AI agents.
MOOSE-Copilot is a novel web-based framework that unifies scientific hypothesis discovery by formalizing human-AI interaction, significantly improving performance over autonomous LLM baselines.
Tong Ye, Hang Yu, Tengfei Ma, Xuhong Zhang +5 more
The paper introduces DOMINO, a novel inductive framework that synthesizes domain-specific data for LLMs using only reference examples, significantly improving performance on challenging, implicitly de…
Huiyu Xu, Zhibo Wang, Wenhui Zhang, Ziqi Zhu +3 more
The paper introduces LoopTrap, an automated red-teaming framework that demonstrates how malicious prompts can poison the termination judgment of LLM agents, causing unbounded computation.