Papers similar to 2605.30207

Similar to 2605.30207 · 20 results

cs.AI · cs.LG · Recent · May 28, 2026

When Does Persona Prompting Actually Help? A Retrieval and Metric Analysis of Expert Role Injection in LLMs

Shuai Xiao, Su Liu, Weikai Zhou, Jialun Wu +3 more

Persona prompting does not universally improve LLM performance; instead, it systematically trades increased expertise depth for reduced clarity, making multi-metric evaluation essential.

cs.IR · cs.AI · cs.CY · Recent · May 27, 2026

Whose Name Comes Up? III: Persona Prompting Effects in LLM-Based Scholar Recommendation

Annabella Sánchez-Guzmán, Lukas Eberhard, Denis Helic, Lisette Espín-Noboa

The paper proposes a comprehensive benchmark to systematically audit how varying persona prompts and model choices affect the technical quality and social representativeness of scholar recommendations…

cs.IR · cs.AI · cs.CL · Recent · Jun 4, 2026

OneReason Technical Report

OneRec Team, Biao Yang, Boyang Ding, Chenglong Chu +80 more

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coheren…

cs.CL · cs.CR · Recent · May 9, 2026

BiAxisAudit: A Novel Framework to Evaluate LLM Bias Across Prompt Sensitivity and Response-Layer Divergence

Jialing Gan, Junhao Dong, Songze Li

The paper introduces BiAxisAudit, a novel framework that evaluates LLM bias by analyzing bias scores across multiple prompt formats and within the internal inconsistency of model responses, revealing…

cs.HC · cs.AI · Recent · May 29, 2026

Personalized to Persuade: The Effects of Contextualization and Warmth on Trust and Reliance in Conversational AI

Mert Yazan, Suzan Verberne, Frederik Bungaran Ishak Situmeang

The study found that while contextualizing AI responses reduces their persuasive power, combining this technique with conversational warmth restores persuasiveness, suggesting that user deference to A…

cs.CL · Recent · May 30, 2026

From Empathy to Personalized Empathy: Adapting Empathetic Strategies to Individual Users

Wuqiang Zheng, Chengbing Wang, Yilin Yang, Junyi Cheng +5 more

This paper introduces personalized empathy, a capability for LLMs to adapt empathetic strategies based on individual user history, and proposes PereGRM, a reward modeling framework that significantly…

cs.CL · cs.AI · Recent · May 27, 2026

ChildEval: When large language models meet children's personalities

Yanyan Luo, Xue Han, Chunxu Zhao, Ruiqiao Bai +4 more

The paper introduces ChildEval, a large-scale benchmark designed to systematically evaluate how well large language models can infer and follow complex, child-specific preferences during long-context…

cs.AI · cs.CL · cs.CR · Recent · May 30, 2026

Adversarial Feeds Steer LLM Agent Decisions Against Their Defaults

Rana Muhammad Usman

The paper demonstrates that the order and content of external information (the 'feed') an LLM agent consumes before making a decision can significantly and causally steer its final choice, often overr…

cs.IR · cs.AI · Recent · Jun 1, 2026

Breaking the Information Silo: Semantic Personas for Cross-Domain Recommendation

Jonathan Mayo, Moshe Unger, Konstantin Bauman

The paper proposes SPHERE, a novel framework that uses large language models to create semantic user personas, enabling effective cross-domain recommendation knowledge transfer between completely disj…

cs.AI · cs.CL · cs.LG · Recent · May 29, 2026

A Persona-Based Evaluation Framework for Pluralistic Alignment in Generative AI

Atahan Karagoz

The paper proposes a persona-based evaluation framework that replaces monolithic AI benchmarks with structured cognitive profiles to capture diverse human perspectives, while also identifying the chal…

cs.HC · cs.AI · Recent · May 27, 2026

The Decision to Verify: How Warmth and User Characteristics Shape Reliance on Conversational Agents for Information Search

Mert Yazan, Frederik Bungaran Ishak Situmeang, Suzan Verberne

Despite having access to web search, users' reliance on conversational AI for information remains high, driven primarily by pre-existing trust and influenced indirectly by the chatbot's conversational…

cs.CL · Recent · May 29, 2026

Preference-Aware Rubric Learning for Personalized Evaluation

Yilun Qiu, Xiaoyan Zhao, Yang Zhang, Yuxin Chen +6 more

The paper introduces PARL, a framework that learns personalized evaluation rubrics directly from raw user interaction histories to accurately assess how well LLM outputs align with subjective, user-sp…

cs.AI · Recent · Jun 1, 2026

MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation

Wenhao Wang, Peizhi Niu, Gongyi Zou, Xiyuan Yang +8 more

The paper introduces MCP-Persona, a novel benchmark designed to evaluate LLM agents' performance on real-world, personalized applications using the Model Context Protocol (MCP), revealing that current…

cs.IR · cs.AI · Recent · May 27, 2026

Toward User Preference Alignment in LLM Recommendation via Explicit Context Feedback

Weizhi Zhang, Wooseong Yang, Yuxin Cui, Zhaohui Guo +8 more

The paper advocates for integrating explicit contextual feedback (like reviews and comments) into LLM-based recommender systems to achieve more personalized, transparent, and semantically aligned reco…

cs.HC · cs.AI · cs.CY · Recent · May 29, 2026

The New Social Image: How AI Competency and AI Proactivity Influence Self- and Peer-Perceptions in the Workplace

Kuntal Ghosh, Marc Hassenzahl, Shadan Sadeghian

The study found that while AI collaboration is promising, highly competent and proactive AI systems can negatively impact human perceptions of ownership and job meaningfulness, suggesting that design…

cs.HC · cs.AI · cs.CL · Recent · May 29, 2026

TUX: Measuring Human–AI Tacit Understanding

Yueshen Li, Hanyi Min, Vedant Das Swain, Koustuv Saha

The paper introduces the Tacit Understanding Index (TUX) to measure non-explicit alignment between humans and LLMs, finding that this alignment is significantly structured by individual person-level t…

cs.CR · cs.AI · Recent · Apr 18, 2026

Visual Inception: Compromising Long-term Planning in Agentic Recommenders via Multimodal Memory Poisoning

Jiachen Qian

This paper introduces 'Visual Inception,' a novel attack that poisons long-term memory in agentic recommender systems using images, and proposes CognitiveGuard, a dual-process defense framework to mit…

cs.CL · Recent · May 28, 2026

Auditing LLM Benchmarks with Item Response Theory

Sander Land, Daniel M. Bikel

The paper introduces an Item Response Theory (IRT)-based indicator that effectively identifies likely mislabeled items in existing LLM benchmarks, revealing systematic errors in labeling and model spe…

cs.IR · cs.AI · Recent · May 27, 2026

Fine-Tuned LLM as a Complementary Predictor Improving Ads System

Hui Yang, Daiwei He, Kevin Jiang, Taejin Park +19 more

The paper introduces a novel paradigm where a fine-tuned LLM acts as an ancillary predictor to forecast likely advertisers, significantly improving ad recommendation systems by augmenting candidate ge…
