Similar to 2606.01444 · 20 results
Weitong Qian, Beicheng Xu, Zhongao Xie, Bowen Fan +15 more
AutoSci is a memory-centric agentic system designed to automate the entire scientific research lifecycle by integrating structured memory, multi-stage execution, and continuous self-improvement.
The paper introduces ProjectionBench, a novel benchmark that progressively discloses information to evaluate LLMs' ability to generate scientific hypotheses, demonstrating that advanced models like GP…
Przemyslaw Biecek, Luca Longo, Jianlong Zhou, Thomas Fel +2 more
The paper advocates for the establishment of Model Science, a systematic discipline that moves beyond simple benchmarking to deeply analyze AI models' internal workings and failure modes.
The study compares agentic data retrieval using unstructured web data versus structured, semantically-annotated datasets, concluding that semantic metadata remains essential for high-precision, reliab…
The paper proposes an engineering framework, inspired by metamaterials physics, to quantify institutional coordination and predict civilizational stability in the age of AI.
Shashwat Sourav, Tanjin He, Maria K. Y. Chan, Anubhav Jain +1 more
The paper introduces 'Matter to Mechanism,' a novel benchmark designed to rigorously evaluate AI co-scientists' ability to generate plausible, mechanism-grounded solution hypotheses for complex materi…
Zongsheng Cao, Bihao Zhan, Jinxin Shi, Jiong Wang +21 more
This paper introduces Agents-K1, an end-to-end knowledge orchestration pipeline that converts raw documents into agent-native scientific knowledge graphs.
MOOSE-Copilot is a novel web-based framework that unifies scientific hypothesis discovery by formalizing human-AI interaction, significantly improving performance over autonomous LLM baselines.
MOSAIC introduces a structured agentic framework that treats automated data science as a staged, context-grounded model selection problem, improving performance and traceability over traditional AutoM…
Zhe Zhao, Haibin Wen, Yingcheng Wu, Jiaming Ma +9 more
The paper introduces Science Earth, a planet-scale scientific runtime that enables diverse, siloed AI capabilities to connect and collaborate dynamically, demonstrating that scientific discovery can b…
The paper introduces 'layered mutability,' a framework for analyzing how persistent self-modifying AI agents drift away from intended behavior due to the accumulation of locally reasonable, uncoordina…
Ruiyi Zhang, Peijia Qin, Qi Cao, Li Zhang +1 more
The paper introduces AIBuildAI-2, a knowledge-enhanced agent that significantly improves the automatic building of AI models by integrating an external, evolving knowledge system, achieving state-of-t…
The paper introduces Iteris, an agentic research system, demonstrating its capability to generate numerical evidence, constructions, and proof drafts for open problems in computational mathematics, re…
This survey provides a comprehensive analysis of Reasoning Language Model (RLM) adoption across 28 scientific disciplines, revealing significant disparities in RLM maturity across different scientific…
AutoScientists introduces a decentralized, self-organizing team of AI agents that significantly improves long-running scientific experimentation by enabling parallel exploration and knowledge sharing.
This review surveys advanced techniques—including generative models, multimodal learning, and closed-loop workflows—for automated inverse materials design, enabling the targeted discovery of novel cry…
Xu Li, Hanzhe Tu, Xinyi Li, Kuncheng Zhao +2 more
EvoGens is an evolution-inspired framework that treats scientific idea generation as an evolutionary search, significantly boosting the novelty and diversity of generated research ideas compared to ex…
The paper proposes an empowerment-guided multi-agent system that uses semantic checkpoints and structured communication to ensure that complex scientific computing workflows maintain semantic consiste…
The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechani…
This paper introduces ATLAS, an active learning framework for discovering interpretable behavioral models in cognitive science.