ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2605.27853· 20 results

cs.AIq-bio.QMRecentJun 1, 2026

AgentPLM: Agentic Protein Language Models with Reasoning-Augmented Decoding for Protein Sequence Design

Sahil Rahman, Maxx Richard Rahman

AgentPLM introduces a novel framework that enhances protein language models by integrating external biophysical tools and a specialized policy optimization, enabling active, reasoning-based protein se…

View →
cs.AIq-bio.BMRecentMay 30, 2026

Probe Before You Edit: Probing-Guided Molecular Optimization for LLM Agents in Structure-Based Drug Design

Zaifei Yang, Weiyu Chen, Yaqing Wang, James Kwok

The paper introduces PROBE, an optimization framework that guides LLM agents in structure-based drug design by performing controlled 'probe edits' to assess how molecular changes affect both binding a…

View →
cond-mat.mtrl-scics.AIcs.LGRecentMay 28, 2026

What drives performance in molecular MPNNs? An operator-level factorial benchmark

Panyu Jiao, Shuizhou Chen, Yiheng Shen, Yuyang Wang +2 more

The paper introduces an operator-level factorial benchmark for molecular MPNNs, finding that message construction (specifically concatenation-based mixing) is the primary determinant of performance, r…

View →
q-bio.BMcs.AIRecentJun 1, 2026

Demystifying Multimodal Biomolecular Co-design With Intrinsic Geodesic Coupling

Keyue Qiu, Xintong Wang, Zhilong Zhang, Hao Zhou +1 more

The paper introduces GeoCoupling, a framework that systematically optimizes the temporal coupling between heterogeneous modalities to improve the co-design of biomolecules, outperforming fixed synchro…

View →
cs.LGcs.AIRecentMay 31, 2026

Fine-Tuning Diffusion Models for Molecular Generation via Reinforcement Learning and Fast Sampling

Guang Lin, Shikui Tu, Lei Xu

The paper introduces FTDiff, a reinforcement learning fine-tuning framework that efficiently generates high-quality, drug-like molecules constrained by a target protein structure, outperforming existi…

View →
cs.AIRecentMay 27, 2026

MACReD: A Multi-Agent Collaborative Reasoning Framework for Reaction Diagram Parsing

Chuang Tang, Chenhao Lin, Yin Xu, Hao Wang +4 more

MACReD introduces a hierarchical multi-agent framework that achieves state-of-the-art performance in parsing complex chemical reaction diagrams by coordinating specialized agents for perception and gl…

View →
cond-mat.mtrl-scics.ETcs.LGRecentJun 1, 2026

Towards Automated Discovery: A Review of Generative Models, Multimodal Learning and Closed-Loop Workflows in Inverse Materials Design

Anand Babu, Rogério Almeida Gouvêa, Gian-Marco Rignanese

This review surveys advanced techniques—including generative models, multimodal learning, and closed-loop workflows—for automated inverse materials design, enabling the targeted discovery of novel cry…

View →
cs.AIcs.CLcs.LORecentMay 27, 2026

Satisfiability Solving with LLMs: A Matched-Pair Evaluation of Reasoning Capability

Leizhen Zhang, Shuhan Chen, Sheng Chen

The paper evaluates LLM reasoning on Boolean satisfiability (SAT) problems, concluding that conventional metrics are misleading and proposing a paired-formula protocol with Accurate Differentiation Ra…

View →
cs.AIRecentMay 27, 2026

AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation

Shanghua Gao, Ada Fang, Marinka Zitnik

AutoScientists introduces a decentralized, self-organizing team of AI agents that significantly improves long-running scientific experimentation by enabling parallel exploration and knowledge sharing.

View →
cs.AIRecentMay 28, 2026

ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure

A. J. Lew, Y. Cao, M. J. Buehler

The paper introduces ProjectionBench, a novel benchmark that progressively discloses information to evaluate LLMs' ability to generate scientific hypotheses, demonstrating that advanced models like GP…

View →
cs.AIRecentMay 29, 2026

MAVEN: Improving Generalization in Agentic Tool Calling

Omkar Ghugarkar, Vishvesh Bhat, Muhammad Ahmed Mohsin, Asad Aali

The paper introduces MAVEN, a lightweight symbolic reasoning scaffold that significantly improves the generalization and end-to-end success rate of large language models in complex, multi-step tool-ca…

View →
cs.CLRecentMay 30, 2026

I-WebGenBench : Evaluating Interactivity in LLM-Generated Scientific Web Applications

Dasen Dai, Biao Wu, Meng Fang, Shuoqi Li +1 more

The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechani…

View →
cs.AIcs.LGRecentMay 30, 2026

MOSAIC: Modular Orchestration for Structured Agentic Intelligence and Composition

Yifan Bao, Xinyu Xi, Xinyu Liu, Wen Ge +7 more

MOSAIC introduces a structured agentic framework that treats automated data science as a staged, context-grounded model selection problem, improving performance and traceability over traditional AutoM…

View →
cs.AIcond-mat.mtrl-sciRecentMay 29, 2026

Coupling Language Models with Physics-based Simulation for Synthesis of Inorganic Materials

Edward W. Staley, Tom Arbaugh, Michael Pekala, Alexander New +5 more

The paper proposes a novel hybrid framework that couples Large Language Models (LLMs) with simplified physics-based simulations to improve the synthesis planning of novel inorganic crystalline materia…

View →
cs.AIRecentMay 27, 2026

Do LLMs Build World Models From Text? A Multilingual Diagnostic of Spatial Reasoning

Zhikai Pan, Chih-Ting Liao, Chunrui Liu, Xi Xiao +4 more

The paper introduces a multilingual benchmark (MentalMap) to test if LLMs build internal spatial world models from text, finding a universal 'L3 reasoning cliff' suggesting that text-only working memo…

View →
cs.CERecentJun 1, 2026

Matter to Mechanism: A Benchmark for AI Co-Scientists in Materials and Battery Research

Shashwat Sourav, Tanjin. He, Maria K. Y. Chan, Anubhav Jain +1 more

The paper introduces 'Matter to Mechanism,' a novel benchmark designed to rigorously evaluate AI co-scientists' ability to generate plausible, mechanism-grounded solution hypotheses for complex materi…

View →
cs.AIRecentMay 28, 2026

OmniMatBench: A Human-Calibrated Multimodal Reasoning Benchmark Across 19 Materials Science Subfields

Wanhao Liu, Jiaqing Xie, Qian Tan, Weida Wang +9 more

The paper introduces OmniMatBench, a comprehensive, human-calibrated multimodal reasoning benchmark covering 19 materials science subfields, revealing that current multimodal language models (MLLMs) h…

View →
cs.CLcs.AIRecentJun 1, 2026

PlanarBench: Evaluating LLM Spatial Reasoning via Planar Graph Drawing

Oleksandr Nikitin

PlanarBench introduces a novel benchmark to test LLM spatial reasoning by requiring them to draw planar graphs as ASCII art from an edge list, finding that edge count is a stronger difficulty predicto…

View →
cs.AIRecentMay 31, 2026

Science Earth: Towards A Planet-Scale Operating System for AI-Native Scientific Discovery

Zhe Zhao, Haibin Wen, Yingcheng Wu, Jiaming Ma +9 more

The paper introduces Science Earth, a planet-scale scientific runtime that enables diverse, siloed AI capabilities to connect and collaborate dynamically, demonstrating that scientific discovery can b…

View →