~ similar to 2605.30187· 20 results
Yulei Ye, Wenhao Li, Zhong Wen, Yunshu Huang +22 more
The paper introduces AgentSchool, an advanced LLM-powered multi-agent simulator that models learning as state transitions to provide a robust, ethically viable testbed for educational research and ped…
Minjing Shi, Junling Wang, Jingwei Ni, Sankalan Pal Chowdhury +1 more
The paper introduces LFTutor, an intelligent tutoring system leveraging LLMs and Socratic questioning to teach laypeople about logical fallacies, demonstrating its effectiveness in fostering critical…
The paper introduces AGENTCL, a rigorous evaluation framework that uses controlled task streams to accurately measure an agent's ability to accumulate and reuse knowledge across multiple tasks, thereb…
Aditya Kumar, Zhihan Lei, Jerry Yan, Joshua W. Momo +5 more
The paper proposes a modular agent framework and novel learning methods to design and optimize practical, cost-effective, and controllable LLM-based agentic systems.
This study surveyed higher education practitioners to map their beliefs and behaviors regarding AI integration, finding that while they view AI favorably, institutional barriers and gaps in design-ori…
The paper introduces an LLM-based pipeline that tags learning resources with structured competencies, achieving strong performance while providing traceable evidence and leveraging graph constraints.
This paper empirically demonstrates that the choice of plan representation (e.g., checklist vs. narrative) significantly impacts the robustness and success rate of LLM-based web agents.
This paper proposes a multi-agent framework using LLMs to improve collaborative story generation, demonstrating that an iterative Writer-Editor process significantly enhances narrative quality for you…
Pengyu Zhu, Lijun Li, Yaxing Lyu, Qianxin Luo +7 more
The paper introduces a unified framework to fairly evaluate LLM agentic capabilities by standardizing diverse benchmarks and separating the effects of the LLM model from the surrounding framework and…
Canran Wang, Yuwen Yang, Zhen Wang, Ming Ma +4 more
The paper designs and evaluates a triadic LLM-Teacher collaboration system for K-12 writing, finding that strategic labor division between the LLM and teacher effectively improves writing quality but…
Tianyi Zhou, Dongrui Liu, Leitao Yuan, Jing Shao +1 more
COLLEAGUE.SKILL introduces an automated system that distills heterogeneous traces of human expertise and role-specific knowledge into portable, inspectable, and usable AI skill packages.
This study evaluates LLMs in conversational tutoring to identify high-confidence social biases, finding that state-of-the-art models are often overconfident in their incorrect assessments of stereotyp…
The paper successfully demonstrates that Large Language Models (LLMs) can be induced to adopt coherent, human-like value structures, showing strong alignment with human psychological patterns.
Zhen Huang, Zhihuang Liu, Mengxuan Luo, Weishang Wu +1 more
The paper proposes a novel attack paradigm demonstrating how compromising a single robot in an LLM-controlled multi-robot system can rapidly propagate malicious intent to cause coordinated unsafe acti…
Kevin Wang, Anna Thöni, Benjamin Kempinski, Bobby Cheng +49 more
The paper introduces Mindgames, a comprehensive multi-game arena for evaluating LLM agents' sustained social and strategic reasoning, demonstrating that current evaluations are limited by structural s…
Zhongyu He, Yuanfan Li, Fei Huang, Tianyu Chen +8 more
SIRI introduces a self-internalizing reinforcement learning framework that allows LLM agents to autonomously discover and integrate reusable skills directly into their core policy, significantly impro…
This study compares different levels of LLM access in a statistics course, finding that structured, guided use significantly improves students' reasoning skills and independent learning compared to un…
This paper proposes shifting the focus of AI research from isolated computational outputs to interaction dynamics, establishing 'Interaction-Centered Intelligence' as the primary framework for underst…
This paper investigates if team-based interaction improves LLM performance on complex reasoning tasks (ChGK), finding that structured team strategies significantly boost accuracy by acting as error-fi…
Jinhu Qi, Muzhi Li, Jiahong Liu, Yuqin Shu +8 more
This survey provides a comprehensive, practical guide to ensuring the trustworthiness of complex, autonomous agentic AI systems by focusing on safety, robustness, privacy, and system security.