Similar to 2606.00370 · 20 results
Haozhe Zhao, Shuzheng Si, Zhenhailong Wang, Zheng Wang +5 more
The paper introduces Crafter, a multi-agent harness that significantly improves the generation of editable, publication-quality scientific figures from diverse inputs, addressing the limitations of ex…
The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechani…
Zongsheng Cao, Bihao Zhan, Jinxin Shi, Jiong Wang +21 more
This paper introduces Agents-K1, an end-to-end knowledge orchestration pipeline that converts raw documents into agent-native scientific knowledge graphs.
This paper proposes a multi-agent framework using LLMs to improve collaborative story generation, demonstrating that an iterative Writer-Editor process significantly enhances narrative quality for you…
Weitong Qian, Beicheng Xu, Zhongao Xie, Bowen Fan +15 more
AutoSci is a memory-centric agentic system designed to automate the entire scientific research lifecycle by integrating structured memory, multi-stage execution, and continuous self-improvement.
Lukas Johanns, Marilin Moor, Davide Panzeri, Yu Zhou +8 more
Agentic-J is a containerized, multi-agent AI assistant designed to enable biologists to perform complex, reproducible biological microscopy image analysis by specifying tasks in natural language.
The study compares agentic data retrieval using unstructured web data versus structured, semantically annotated datasets, concluding that semantic metadata remains essential for high-precision, reliab…
AutoForest is an end-to-end system that automatically generates publication-ready forest plots directly from biomedical papers, streamlining the labor-intensive process of meta-analysis.
Frontier LLM-based agents can effectively overcome the manual bottleneck of phenotype annotation by achieving consistency comparable to that of human experts, significantly outperforming existing NLP tools.
MOSAIC introduces a structured agentic framework that treats automated data science as a staged, context-grounded model selection problem, improving performance and traceability over traditional AutoM…
VESTA introduces a novel agent framework that enhances Visual Language Models (VLMs) by equipping them with a dynamic, reusable toolkit of diagnostic and statistical tools, significantly improving aut…
Ruiyi Zhang, Peijia Qin, Qi Cao, Li Zhang +1 more
The paper introduces AIBuildAI-2, a knowledge-enhanced agent that significantly improves the automatic building of AI models by integrating an external, evolving knowledge system, achieving state-of-t…
Yujie Luo, Xiangyuan Ru, Jingsheng Zheng, Jingjing Wang +9 more
The paper introduces Autonomous Agentic Data Engineering, demonstrating that LLMs can autonomously plan and optimize end-to-end data curation pipelines, leading to substantial performance gains in spe…
Wenhang Shi, Jinhao Dong, Yiren Chen, Zhe Zhao +3 more
The paper introduces Grounded Agentic Interaction Synthesis (GAIS), a framework that generates high-quality, diverse, and complex agentic training data by anchoring tasks to real-world protocols, sign…
MOOSE-Copilot is a novel web-based framework that unifies scientific hypothesis discovery by formalizing human-AI interaction, significantly improving performance over autonomous LLM baselines.
AgentPLM introduces a novel framework that enhances protein language models by integrating external biophysical tools and a specialized policy optimization, enabling active, reasoning-based protein se…
The paper introduces AGENTCL, a rigorous evaluation framework that uses controlled task streams to accurately measure an agent's ability to accumulate and reuse knowledge across multiple tasks, thereb…
The paper proposes a category-theoretic framework for agentic AI that models scientific discovery not as answer generation, but as a verifiable transition and revision of the underlying representation…
This study benchmarks token-optimized formats (TOON and TRON) against JSON in end-to-end agentic AI systems, finding that TRON significantly reduces token overhead with minimal performance degradation…
This paper compares two agentic AI systems, Claude Code and Codex, on a gravitational wave data analysis pipeline, finding that while both achieve scientific convergence, they exhibit vastly different…