ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2605.28655· 20 results

cs.AIRecentMay 29, 2026

AutoSci: A Memory-Centric Agentic System for the Full Scientific Research Lifecycle

Weitong Qian, Beicheng Xu, Zhongao Xie, Bowen Fan +15 more

AutoSci is a memory-centric agentic system designed to automate the entire scientific research lifecycle by integrating structured memory, multi-stage execution, and continuous self-improvement.

View →
cs.AIRecentMay 27, 2026

AIBuildAI-2: A Knowledge-Enhanced Agent for Automatically Building AI Models

Ruiyi Zhang, Peijia Qin, Qi Cao, Li Zhang +1 more

The paper introduces AIBuildAI-2, a knowledge-enhanced agent that significantly improves the automatic building of AI models by integrating an external, evolving knowledge system, achieving state-of-t…

View →
cs.CLcs.AIRecentJun 1, 2026

AutoForest: Automatically Generating Forest Plots from Biomedical Studies with End-to-End Evidence Extraction and Synthesis

Massimiliano Pronesti, Angelo Miculescu, Mohsin Kapdi, Paul Flanagan +7 more

AutoForest is an end-to-end system that automatically generates publication-ready forest plots directly from biomedical papers, streamlining the labor-intensive process of meta-analysis.

View →
cs.AIRecentMay 31, 2026

The Case for Model Science: Verify, Explore, Steer, Refine

Przemyslaw Biecek, Luca Longo, Jianlong Zhou, Thomas Fel +2 more

The paper advocates for the establishment of Model Science, a systematic discipline that moves beyond simple benchmarking to deeply analyze AI models' internal workings and failure modes.

View →
cs.AIRecentMay 27, 2026

MolLingo: Molecule-Native Representations for LLM-Powered Scientific Agents

Thao Nguyen, Heng Ji

MolLingo is a multi-agent system that significantly improves automated molecular design by integrating domain-specific chemical reasoning and structural context into LLMs, outperforming state-of-the-a…

View →
cs.CLcs.AIcs.CERecentMay 28, 2026

MOOSE-Copilot: A Web-Based Interactive Assistant for Unified Exploratory and Fine-Grained Scientific Hypothesis Discovery

Hongran An, Zonglin Yang

MOOSE-Copilot is a novel web-based framework that unifies scientific hypothesis discovery by formalizing human-AI interaction, significantly improving performance over autonomous LLM baselines.

View →
cs.AIRecentMay 31, 2026

Science Earth: Towards A Planet-Scale Operating System for AI-Native Scientific Discovery

Zhe Zhao, Haibin Wen, Yingcheng Wu, Jiaming Ma +9 more

The paper introduces Science Earth, a planet-scale scientific runtime that enables diverse, siloed AI capabilities to connect and collaborate dynamically, demonstrating that scientific discovery can b…

View →
cs.AIRecentJun 1, 2026

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

Junqi Liu, Salena Song, Yuhan Wang, Jiawei Mao +11 more

The paper introduces AutoMedBench, a novel workflow-aware benchmark that evaluates autonomous medical-AI agents across a five-stage research process, revealing that agents struggle most with validatio…

View →
cs.AIRecentMay 30, 2026

Ryze: Evidence-Enriched Data Synthesis from Biomedical Papers

Yeqi Huang, Yue Chen, Yanwei Ye, Guanhao Su +1 more

The paper introduces Ryze, an automated system that synthesizes evidence-enriched Question-Answering (QA) pairs from raw biomedical papers, resulting in a specialized VLM (BioVLM-8B) that significantl…

View →
cs.AIq-bio.QMRecentJun 1, 2026

AgentPLM: Agentic Protein Language Models with Reasoning-Augmented Decoding for Protein Sequence Design

Sahil Rahman, Maxx Richard Rahman

AgentPLM introduces a novel framework that enhances protein language models by integrating external biophysical tools and a specialized policy optimization, enabling active, reasoning-based protein se…

View →
cs.LGcs.AIEmpiricalRecentJun 10, 2026

ATLAS: Active Theory Learning for Automated Science

Noémi Éltető, Nathaniel D. Daw, Kimberly L. Stachenfeld, Kevin J. Miller

This paper introduces ATLAS, an active learning framework for discovering interpretable behavioral models in cognitive science.

View →
cs.AIRecentMay 27, 2026

Verifiable Benchmarking of Long-Horizon Spatial Biology

Ian Diks, Harihara Muralidharan, Tim Proctor, Kenny Workman

The paper introduces SpatialBench-Long, a comprehensive benchmark designed to test AI agents' ability to perform end-to-end scientific reasoning and derive biological claims from complex, raw spatial…

View →
cs.AIcond-mat.mtrl-scics.CLRecentMay 31, 2026

Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence

Fiona Y. Wang, Markus J. Buehler

The paper proposes a category-theoretic framework for agentic AI that models scientific discovery not as answer generation, but as a verifiable transition and revision of the underlying representation…

View →
cs.CERecentJun 1, 2026

Matter to Mechanism: A Benchmark for AI Co-Scientists in Materials and Battery Research

Shashwat Sourav, Tanjin. He, Maria K. Y. Chan, Anubhav Jain +1 more

The paper introduces 'Matter to Mechanism,' a novel benchmark designed to rigorously evaluate AI co-scientists' ability to generate plausible, mechanism-grounded solution hypotheses for complex materi…

View →
cs.AIcs.CLEmpiricalRecentJun 11, 2026

EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery

Amy Xin, Jiening Siow, Junjie Wang, Zijun Yao +4 more

This paper presents EurekAgent, an environment-engineered agent system for metric-driven autonomous scientific discovery.

View →
cs.MAcs.AIcs.LGRecentMay 28, 2026

Discovering Cooperative Pipelines: Autoresearch for Sequential Social Dilemmas

Víctor Gallego

The paper introduces an outer-loop AI agent that autonomously redesigns LLM policy-synthesis pipelines for multi-agent social dilemmas, demonstrating that the optimal pipeline structure depends critic…

View →
cs.AIRecentMay 28, 2026

ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure

A. J. Lew, Y. Cao, M. J. Buehler

The paper introduces ProjectionBench, a novel benchmark that progressively discloses information to evaluate LLMs' ability to generate scientific hypotheses, demonstrating that advanced models like GP…

View →
cs.CVcs.AIcs.CLRecentMay 28, 2026

Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

Haozhe Zhao, Shuzheng Si, Zhenhailong Wang, Zheng Wang +5 more

The paper introduces Crafter, a multi-agent harness that significantly improves the generation of editable, publication-quality scientific figures from diverse inputs, addressing the limitations of ex…

View →
cs.AIcs.LGRecentMay 30, 2026

MOSAIC: Modular Orchestration for Structured Agentic Intelligence and Composition

Yifan Bao, Xinyu Xi, Xinyu Liu, Wen Ge +7 more

MOSAIC introduces a structured agentic framework that treats automated data science as a staged, context-grounded model selection problem, improving performance and traceability over traditional AutoM…

View →
cs.MAcs.AIcs.CVRecentJun 1, 2026

Agentic-J: An AI Agent for Biological Microscopy Image Analysis

Lukas Johanns, Marilin Moor, Davide Panzeri, Yu Zhou +8 more

Agentic-J is a containerized, multi-agent AI assistant designed to enable biologists to perform complex, reproducible biological microscopy image analysis by specifying tasks in natural language.

View →