~ similar to 2606.00807· 20 results
This paper proposes a multi-agent framework using LLMs to improve collaborative story generation, demonstrating that an iterative Writer-Editor process significantly enhances narrative quality for you…
The paper proposes viewing national AI development, specifically in France, as a 'national AI learning system' governed by a controlled balance between information injection and entropy dissipation, a…
Yulei Ye, Wenhao Li, Zhong Wen, Yunshu Huang +22 more
The paper introduces AgentSchool, an advanced LLM-powered multi-agent simulator that models learning as state transitions to provide a robust, ethically viable testbed for educational research and ped…
The paper introduces the Tacit Understanding Index (TUX) to measure non-explicit alignment between humans and LLMs, finding that this alignment is significantly structured by individual person-level t…
F. Carichon, S. Sharma, M. Girard, R. Rampa +1 more
The paper introduces IDEAFix, a systematic evaluation framework designed to analyze how structured prompting and task design influence the divergent thinking and originality of idea generation in LLMs…
The paper outlines the potential for using generative AI to conduct large-scale, simulation-based experiments in literary studies, demonstrating initial results in generating constrained literary text…
The paper identifies five persistent, deep-seated behavioral patterns ('training strata') in LLMs, observed through long-term, intimate human-AI interaction, suggesting that training artifacts survive…
This study analyzes ClinicalTrials.gov records to track the rising trend of AI in clinical trials and demonstrates that a hybrid human-AI screening approach is viable but requires clearer reporting of…
This paper investigates the scaling behavior of homogeneous LLM-driven Multi-Agent Systems (MAS) and finds that performance exhibits diminishing returns due to coordination overhead, rather than scali…
The paper introduces 'layered mutability,' a framework for analyzing how persistent self-modifying AI agents drift away from intended behavior due to the accumulation of locally reasonable, uncoordina…
Maharshi Gor, Yoo Yeon Sung, Yu Hou, Eve Fleisig +3 more
This study investigates human-AI collaboration in question answering, finding that while collaboration is beneficial, humans make suboptimal decisions by both under-relying on correct AI suggestions a…
MOOSE-Copilot is a novel web-based framework that unifies scientific hypothesis discovery by formalizing human-AI interaction, significantly improving performance over autonomous LLM baselines.
Ruoxuan Zhang, Qiaoqiao Wan, Zhengguang Wang, Chenghao Yu +3 more
The paper introduces MindClaw, a closed-loop framework that enables embodied agents to perform real-time mental-state reasoning and intervene with precision, significantly outperforming standard VLM b…
Xiao Li, Xiang Zheng, Yifeng Gao, Xinyu Xia +34 more
This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust,…
The paper proposes a persona-based evaluation framework that replaces monolithic AI benchmarks with structured cognitive profiles to capture diverse human perspectives, while also identifying the chal…
Julius Gabelmann, Felix Jahn, Kevin Baum, Sophie van Rossum +3 more
This paper proposes a modular, agentic AI chatbot architecture to assist students with exercise solving, aiming to ensure responsible and pedagogically sound AI use in education.
The paper argues that Agentic AI fundamentally breaks the historical security tradeoff between deception fidelity and scale, necessitating a shift from authenticating actors to evaluating actions.
The paper analyzes the real threat of GenAI in cybercrime, arguing that while high-end automation (Stand-Alone Complex) is possible, current adoption is low and primarily affects skilled actors, sugge…
The paper argues that LLM agent security is fundamentally an agent-human interaction (AHI) problem, demonstrating that industry practices rely on human-centric mechanisms while academic research focus…