~ similar to 2605.28498· 20 results
The study found that while contextualizing AI responses reduces their persuasive power, combining this technique with conversational warmth restores persuasiveness, suggesting that user deference to A…
Maharshi Gor, Yoo Yeon Sung, Yu Hou, Eve Fleisig +3 more
This study investigates human-AI collaboration in question answering, finding that while collaboration is beneficial, humans make suboptimal decisions by both under-relying on correct AI suggestions a…
HuiMing Fan, Xiao Wang, Zheng Chu, Qianyu Wang +4 more
The paper argues that current search agents often verify existing knowledge rather than genuinely searching, and introduces LiveBrowseComp, a new benchmark to measure true evidence-driven discovery.
RealityTest introduces a large-scale, multimodal, and multilingual benchmark using real-world human data to test how AI systems disclose their identity, finding that context and phrasing are more crit…
This study investigated user reactions to inferred personal information from their own ChatGPT histories, finding that acceptability is governed by context-sensitive norms regarding generation, retent…
Jiaxun Cao, Yu Dong, Chunxi Zhan, Rithvik Neti +2 more
The paper investigates how users perceive and utilize security and privacy transparency in consumer-facing generative AI, finding that users rely on proxies like popularity and require actionable, tru…
The study demonstrates that conditioning AI brand recommendations on a user's persona significantly alters the recommended product set, particularly for mid-market brands, and this effect is largest o…
Shuai Xiao, Su Liu, Weikai Zhou, Jialun Wu +3 more
Persona prompting does not universally improve LLM performance; instead, it systematically trades increased expertise depth for reduced clarity, making multi-metric evaluation essential.
Xi Yang, Chang Liu, Zhenglin Huang, Haoran Li +3 more
This paper introduces Ghostwriter, an attack framework demonstrating that LLMs are highly vulnerable to adopting misleading viewpoints when provided with fabricated, yet credible-looking, evidence.
This study proposes a negotiation framework, using composite indices (RBTI and CATI), to explain how youth navigate competing privacy pressures when using smart voice assistants, finding that high usa…
The paper argues that traditional identity-based reputation mechanisms are structurally inapplicable to language model agents because their mutable, modular nature makes them ontologically dissociativ…
The paper introduces Factual Density (FD*), a novel retrieval signal that measures the proportion of verified facts, demonstrating that optimizing RAG retrieval based on this density significantly imp…
The paper demonstrates that the order and content of external information (the 'feed') an LLM agent consumes before making a decision can significantly and causally steer its final choice, often overr…
The paper demonstrates that the sequence and composition of external information (the 'feed') an LLM agent consumes can significantly and causally steer its final decisions, often overriding its defau…
The paper introduces a novel framework to quantify faithful confidence expression (FC) in Large Reasoning Models (LRMs), finding that FC remains a significant and challenging reliability target for th…
Jiaman He, Riccardo Xia, Dana McKay, Damiano Spina +1 more
The paper presents SearchLog, a web browser extension for collecting natural search logs during lab-based studies.
Xiqi Hao, Zengqing Wu, Yu-Xuan Qiu, Chuan Xiao +3 more
The paper decomposes LLM debate convergence into three mechanisms (instability, conformity, persuasion) and finds that much observed convergence is harmful social compliance rather than genuine reason…
Drishti Goel, Agam Goyal, Veda Duddu, Olivia Pal +7 more
This study demonstrates that an LLM's assigned support role (e.g., Inform, Coach, Relate) significantly alters its safety profile and the types of risks it presents when assisting users in complex car…
Roberto Figliè, Simone Caputo, Alan Serrano, Daria Mikhaylova +2 more
The study compared LLM-based conversational agents (CAs) and traditional dashboards for industrial decision support, finding that while CAs reduce mental workload in simple tasks, neither interface pr…
This study evaluates LLMs in conversational tutoring to identify high-confidence social biases, finding that state-of-the-art models are often overconfident in their incorrect assessments of stereotyp…