ArXivCSExplorer
Built by Teycir Ben Soltane

Similar to 2605.29625 · 20 results

cs.CL · Recent · Jun 1, 2026

AI as a Tool for Simulation-Based Experiments in Literary Studies

Matthew Wilkens

The paper outlines the potential for using generative AI to conduct large-scale, simulation-based experiments in literary studies, demonstrating initial results in generating constrained literary text…

cs.AI, cs.MA · Recent · May 28, 2026

AgentSchool: An LLM-Powered Multi-Agent Simulation for Education

Yulei Ye, Wenhao Li, Zhong Wen, Yunshu Huang +22 more

The paper introduces AgentSchool, an advanced LLM-powered multi-agent simulator that models learning as state transitions to provide a robust, ethically viable testbed for educational research and ped…

cs.AI · Recent · May 28, 2026

MINDGAMES: A Live Arena for Evaluating Social and Strategic Reasoning in Multi-Agent LLMs

Kevin Wang, Anna Thöni, Benjamin Kempinski, Bobby Cheng +49 more

The paper introduces Mindgames, a comprehensive multi-game arena for evaluating LLM agents' sustained social and strategic reasoning, demonstrating that current evaluations are limited by structural s…

cs.AI, cs.HC · Recent · May 30, 2026

Interaction-Centered Intelligence: Toward Interaction as the Primary Unit of Analysis in Co-Creative AI and Human-AI Systems

Nicholas Davis

This paper proposes shifting the focus of AI research from isolated computational outputs to interaction dynamics, establishing 'Interaction-Centered Intelligence' as the primary framework for underst…

cs.SE, cs.AI, cs.CV · Recent · May 27, 2026

GUI Agents for Continual Game Generation

Yixu Huang, Bo Li, Na Li, Zhe Wang +7 more

The paper proposes using GUI agents, both as objective evaluators and subjective playtesters, to significantly improve the generation of playable games from prompts, demonstrating a 66.8% rubric pass-…

cs.CL · Recent · May 28, 2026

Can LLM Teams Play What? Where? When?

Anastasia Kotelnikova, Viktor Byzov, Maria Dolzhenkova, Evgeny Kotelnikov

This paper investigates if team-based interaction improves LLM performance on complex reasoning tasks (ChGK), finding that structured team strategies significantly boost accuracy by acting as error-fi…

cs.AI · Recent · May 31, 2026

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

Yuxuan Liu, Zhaochen Su, Lingyun Xie, Yuhao Zhang +10 more

SkillRevise is an execution-grounded framework that iteratively refines initial, imperfect LLM agent skills by diagnosing defects from execution evidence and applying empirically validated edits, sign…

cs.CL · Recent · May 30, 2026

IDEAFix: Evaluation Framework for Creative Defixation Prompting in LLMs

F. Carichon, S. Sharma, M. Girard, R. Rampa +1 more

The paper introduces IDEAFix, a systematic evaluation framework designed to analyze how structured prompting and task design influence the divergent thinking and originality of idea generation in LLMs…

cs.HC, cs.AI · Recent · May 29, 2026

Agentic Authoring of Interactive Multiview Visualizations in Genomics

Astrid van den Brandt, Kiroong Choe, Sehi L'Yi, Devin Lange +1 more

The paper evaluates various LLM-based agentic schemes for authoring complex, interactive, multiview genomics visualizations, finding that agentic iteration significantly improves visualization quality…

cs.AI, cs.CL, cs.LG · Recent · May 29, 2026

COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

Tianyi Zhou, Dongrui Liu, Leitao Yuan, Jing Shao +1 more

COLLEAGUE.SKILL introduces an automated system that distills heterogeneous traces of human expertise and role-specific knowledge into portable, inspectable, and usable AI skill packages.

cs.AI, cs.CL · Recent · Jun 1, 2026

AGENTCL: Toward Rigorous Evaluation of Continual Learning in Language Agents

Yiheng Shu, Bernal Jiménez Gutiérrez, Saisri Padmaja Jonnalagedda, Yuguang Yao +2 more

The paper introduces AGENTCL, a rigorous evaluation framework that uses controlled task streams to accurately measure an agent's ability to accumulate and reuse knowledge across multiple tasks, thereb…

cs.AI, cs.CL · Recent · May 28, 2026

Teaching Values to Machines: Simulating Human-Like Behavior in LLMs

Asaf Yehudai, Naama Rozen, Ariel Gera

The paper successfully demonstrates that Large Language Models (LLMs) can be induced to adopt coherent, human-like value structures, showing strong alignment with human psychological patterns.

cs.CL · Recent · May 30, 2026

I-WebGenBench: Evaluating Interactivity in LLM-Generated Scientific Web Applications

Dasen Dai, Biao Wu, Meng Fang, Shuoqi Li +1 more

The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechani…

cs.AI, cs.CR, cs.CY · Recent · Apr 16, 2026

Layered Mutability: Continuity and Governance in Persistent Self-Modifying Agents

Krti Tallam

The paper introduces 'layered mutability,' a framework for analyzing how persistent self-modifying AI agents drift away from intended behavior due to the accumulation of locally reasonable, uncoordina…

cs.CL, cs.AI · Recent · May 27, 2026

ChildEval: When large language models meet children's personalities

Yanyan Luo, Xue Han, Chunxu Zhao, Ruiqiao Bai +4 more

The paper introduces ChildEval, a large-scale benchmark designed to systematically evaluate how well large language models can infer and follow complex, child-specific preferences during long-context…

cs.MA, cs.AI, cs.CY · Recent · May 30, 2026

Scaling Behavior of Single LLM-Driven Multi-Agent Systems

Jialing Li, Zhouhong Gu, Yin Cai, Hongwei Feng

This paper investigates the scaling behavior of homogeneous LLM-driven Multi-Agent Systems (MAS) and finds that performance exhibits diminishing returns due to coordination overhead, rather than scali…

cs.AI · Recent · May 30, 2026

CoMIC: Collaborative Memory and Insights Circulation for Long-Horizon LLM Agents in Cloud-Edge Systems

Yannan Wang, Longli Yang, Zhen Liu, Abhishek Kumar +1 more

CoMIC is a cloud-edge framework that enables resource-constrained LLM agents to successfully complete complex, long-horizon tasks by collaboratively sharing and refining memory and insights between lo…

cs.CL, cs.AI · Recent · May 27, 2026

StoryLens: Preference-Aligned Story Rewriting via Context-Aware Narrative Enrichment

Hanwen Cui, Yuting Mei, Yuhang Fu, Dingyi Yang +1 more

The paper introduces STORYLENSWRITER, a novel framework that significantly improves personalized story rewriting by incorporating context-aware narrative enrichment, outperforming style-only adaptatio…

cs.MA, cs.AI · Recent · May 28, 2026

Evolve as a Team: Collaborative Self-Evolution for LLM-based Multi-Agent Systems

Zhezheng Hao, Tianfu Wang, Huanshuo Dong, Ziyan Liu +6 more

The paper proposes Meta-Team, an experience-driven framework that enables multi-agent systems (MAS) to collaboratively self-evolve by transforming complex execution experiences into reusable improveme…

cs.LG, cs.AI · Recent · May 29, 2026

Learning to Construct Practical Agentic Systems

Aditya Kumar, Zhihan Lei, Jerry Yan, Joshua W. Momo +5 more

The paper proposes a modular agent framework and novel learning methods to design and optimize practical, cost-effective, and controllable LLM-based agentic systems.
