ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2606.02300· 20 results

cs.AIcs.CLRecentMay 28, 2026

Teaching Values to Machines: Simulating Human-Like Behavior in LLMs

Asaf Yehudai, Naama Rozen, Ariel Gera

The paper successfully demonstrates that Large Language Models (LLMs) can be induced to adopt coherent, human-like value structures, showing strong alignment with human psychological patterns.

View →
cs.AIcs.CLRecentMay 27, 2026

Adopt $\neq$ Adapt: Longitudinal Analyses of LLM Conversations in the Wild

Rebecca M. M. Hicke, Kiran Tomlinson

Analyzing longitudinal data from 12,000 Copilot users, the paper finds that individual user habits regarding LLM interaction are highly sticky and difficult to change, and that existing datasets may o…

View →
cs.CLRecentMay 29, 2026

Preference-Aware Rubric Learning for Personalized Evaluation

Yilun Qiu, Xiaoyan Zhao, Yang Zhang, Yuxin Chen +6 more

The paper introduces PARL, a framework that learns personalized evaluation rubrics directly from raw user interaction histories to accurately assess how well LLM outputs align with subjective, user-sp…

View →
cs.CLcs.AIRecentMay 27, 2026

ChildEval: When large language models meet children's personalities

Yanyan Luo, Xue Han, Chunxu Zhao, Ruiqiao Bai +4 more

The paper introduces ChildEval, a large-scale benchmark designed to systematically evaluate how well large language models can infer and follow complex, child-specific preferences during long-context…

View →
cs.CRcs.LGRecentMay 12, 2026

PrivacySIM: Evaluating LLM Simulation of User Privacy Behavior

James Flemings, Murali Annavaram

The paper introduces PrivacySIM, an evaluation suite that benchmarks how well LLMs can simulate individual user privacy decisions based on persona attributes, finding that while conditioning improves…

View →
cs.AIRecentJun 1, 2026

MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation

Wenhao Wang, Peizhi Niu, Gongyi Zou, Xiyuan Yang +8 more

The paper introduces MCP-Persona, a novel benchmark designed to evaluate LLM agents' performance on real-world, personalized applications using the Model Context Protocol (MCP), revealing that current…

View →
cs.AIcs.CRRecentApr 13, 2026

Mobile GUI Agent Privacy Personalization with Trajectory Induced Preference Optimization

Zhixin Lin, Jungang Li, Dongliang Xu, Shidong Pan +4 more

The paper proposes Trajectory Induced Preference Optimization (TIPO) to improve mobile GUI agent personalization by explicitly modeling and optimizing for privacy-related behavioral differences in exe…

View →
cs.CLRecentMay 30, 2026

From Empathy to Personalized Empathy: Adapting Empathetic Strategies to Individual Users

Wuqiang Zheng, Chengbing Wang, Yilin Yang, Junyi Cheng +5 more

This paper introduces personalized empathy, a capability for LLMs to adapt empathetic strategies based on individual user history, and proposes PereGRM, a reward modeling framework that significantly…

View →
cs.IRcs.AIRecentMay 27, 2026

Toward User Preference Alignment in LLM Recommendation via Explicit Context Feedback

Weizhi Zhang, Wooseong Yang, Yuxin Cui, Zhaohui Guo +8 more

The paper advocates for integrating explicit contextual feedback (like reviews and comments) into LLM-based recommender systems to achieve more personalized, transparent, and semantically aligned reco…

View →
cs.CRRecentMay 7, 2026

Profiling for Pennies: Unveiling the Privacy Iceberg of LLM Agents

Jiahao Chen, Qi Zhang, Ruixiao Lin, Chunyi Zhou +6 more

The paper introduces the PrivacyIceberg framework to systematically categorize and empirically demonstrate the high risk of automated, deep personal profiling using LLM agents, revealing a significant…

View →
cs.IRcs.AIcs.CYRecentMay 27, 2026

Whose Name Comes Up? III: Persona Prompting Effects in LLM-Based Scholar Recommendation

Annabella Sánchez-Guzmán, Lukas Eberhard, Denis Helic, Lisette Espín-Noboa

The paper proposes a comprehensive benchmark to systematically audit how varying persona prompts and model choices affect the technical quality and social representativeness of scholar recommendations…

View →
cs.CLcs.AIcs.HCRecentMay 28, 2026

EUDAIMONIA: Evaluating Undesirable Dynamics in AI

Jun Rui Huang, Wang Bill Zhu, Ziyi Liu, Nathanael Fast +2 more

The paper introduces EUDAIMONIA, a new framework and benchmark for evaluating how well LLMs align with user welfare in social interactions, finding that even state-of-the-art models frequently violate…

View →
cs.AIRecentMay 29, 2026

Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration

Weile Chen, Bingchen Miao, Qifan Yu, Wendong Bu +5 more

The paper proposes SCALE, a self-improving web agent framework that uses adversarial roles and graph exploration to autonomously discover agent limitations and enhance adaptability in complex web envi…

View →
cs.CLcs.AIcs.LGRecentMay 30, 2026

On the Limits of LLM Adaptability: Impact of Model-Internalized Priors on Annotation Task Performance

Etienne Casanova, Rafal Kocielnik, R. Michael Alvarez

The paper demonstrates that LLM performance in zero-shot annotation is significantly limited by the alignment between the model's internal understanding and the task definition, showing that prompt-ba…

View →
cs.CLRecentJun 1, 2026

CRAB-Bench: Evaluating LLM Agents under Complex Task Dependencies and Human-aligned User Simulation

Danqing Wang, Akshay Sivaraman, Lei Li

The paper introduces CRAB-Bench and RUSE, a rigorous evaluation framework that tests LLM agents on complex, interdependent tasks with realistic human user interactions, revealing significant performan…

View →
cs.AIcs.LGRecentMay 28, 2026

When and How Human Curation Backfires: Preference Alignment under Multi-Model Self-Consuming Loop

Yang Zhang, Xiukun Wei, Xueru Zhang

This paper analyzes multi-model self-consuming training, showing that while human curation helps individual models, cross-model interactions can degrade long-term alignment by dampening or inverting t…

View →
cs.HCcs.AIcs.CLRecentMay 28, 2026

LLUMI: Improving LLM Writing Assistance for Mental Health Support with Online Community Feedback

Jiwon Kim, Maya Ajit, Sherry Gong, Soorya Ram Shimgekar +3 more

The paper introduces LLUMI, an open-source framework that improves LLM writing assistance for mental health support using community feedback, demonstrating comparable performance to proprietary models…

View →
cs.CLcs.AIRecentMay 29, 2026

Isolating LLM Lexical Bias: A Curation-Free Triangulated Metric for Preference-Stage Learning

Xiaoyang Ming, Jose Hernandez, Thomas Stephan Juzek

The paper introduces the Triangulated Preference Shift score, an automated, curation-free metric to quantify systematic lexical biases introduced into Large Language Models during the preference-learn…

View →
cs.CLcs.AIcs.CRRecentMar 31, 2026

Can LLMs Infer Conversational Agent Users' Personality Traits from Chat History?

Derya Cögendez, Verena Zimmermann, Noé Zufferey

This study quantifies the privacy risk of inferring sensitive personality traits from user interactions with LLM-based conversational agents, demonstrating that machine learning models can accurately…

View →
cs.CLRecentMay 30, 2026

IDEAFix: Evaluation Framework for Creative Defixation Prompting in LLMs

F. Carichon, S. Sharma, M. Girard, R. Rampa +1 more

The paper introduces IDEAFix, a systematic evaluation framework designed to analyze how structured prompting and task design influence the divergent thinking and originality of idea generation in LLMs…

View →