Papers similar to 2605.30685

20 results

cs.CL · Recent · May 29, 2026

RealityTest: How People Probe AI Identity and Whether Models Disclose It

Anna Gausen, Sarenne Wallbridge, Bessie O'Dell, Christopher Summerfield +1 more

RealityTest introduces a large-scale, multimodal, and multilingual benchmark using real-world human data to test how AI systems disclose their identity, finding that context and phrasing are more crit…


cs.AI · Recent · May 27, 2026

Trends in AI and Human-AI Interaction in Clinical Trials -- A Hybrid Human-AI Exploration

Sandra Woolley, Tim Collins, Khalid Khattak, Illia Chernomorets +2 more

This study analyzes ClinicalTrials.gov records to track the rising trend of AI in clinical trials and demonstrates that a hybrid human-AI screening approach is viable but requires clearer reporting of…


cs.CL · Recent · Jun 1, 2026

CARTE: A Benchmark for Mapping Language Model Knowledge Across France

Sarah Almeida Carneiro, Christos Xypolopoulos, Xiao Fei, Yang Zhang +1 more

The paper introduces CARTE, a new benchmark designed to test how well large language models understand fine-grained, regionally differentiated knowledge across the 13 metropolitan regions of France, r…


cs.CR, cs.CY · Recent · Apr 30, 2026

Tracking Conversations: Measuring Content and Identity Exposure on AI Chatbots

Muhammad Jazlan, Ethan Wang, Yash Vekaria, Zubair Shafiq

This paper systematically measured web tracking across 20 popular AI chatbots, finding that a majority share both conversational content and user identity information with third parties.


cs.AI · Recent · May 27, 2026

Benchmarking AI for low-resource contexts: Thinking beyond leaderboards

Aakash Pant, Kavya Shah, Apoorv Agnihotri, Sneha Nikam +2 more

The paper critiques current AI benchmarking practices for low-resource settings, arguing that evaluation must shift focus from isolated model performance to the holistic performance of the deployed sy…


cs.CR · Recent · Apr 10, 2026

ChatGPT, is this real? The influence of generative AI on writing style in top-tier cybersecurity papers

Daan Vansteenhuyse

This paper analyzes top-tier cybersecurity papers to find evidence of generative AI's influence, finding a post-2022 increase in AI-associated marker words and a general drift toward higher lexical co…


cs.AI, cs.CL · Recent · May 27, 2026

Adopt $\neq$ Adapt: Longitudinal Analyses of LLM Conversations in the Wild

Rebecca M. M. Hicke, Kiran Tomlinson

Analyzing longitudinal data from 12,000 Copilot users, the paper finds that individual user habits regarding LLM interaction are highly sticky and difficult to change, and that existing datasets may o…


cs.AI, cs.CL · Recent · May 28, 2026

Teaching Values to Machines: Simulating Human-Like Behavior in LLMs

Asaf Yehudai, Naama Rozen, Ariel Gera

The paper successfully demonstrates that Large Language Models (LLMs) can be induced to adopt coherent, human-like value structures, showing strong alignment with human psychological patterns.


cs.AI · Recent · May 30, 2026

AI Sovereignty as National Learning Capacity: A Human-Centered Learning Mechanics Viewpoint on France, the United States, and China

Kim Phuc Tran

The paper proposes viewing national AI development, specifically in France, as a 'national AI learning system' governed by a controlled balance between information injection and entropy dissipation, a…


cs.CL, cs.AI · Recent · May 27, 2026

DEPART: DEcomposing PARiTy across Multilingual LLMs

Manan Uppadhyay, Prashant Kodali, Pranjal Chitale, Reshma Ramaprasad +2 more

The paper introduces a diagnostic framework to decompose multilingual LLM performance variance, showing that language identity and model-benchmark interactions are key drivers of performance gaps.


econ.GN, cs.CE, cs.CV · Recent · May 31, 2026

Differing Roles of Leisure and Productivity in GDP - A Machine Learning based comparative analysis of Germany and USA

Achintya Ranjan, Uma Ranjan

This paper uses machine learning to model a country's GDP based on working hours and productivity, demonstrating that the differing relative importance of these two factors between Germany and the USA…


cs.CL · Recent · May 29, 2026

Anchoring LLM Gender Bias to Human Baselines: A Cross-Lingual Audit

Jiwoo Choi, Seonwoo Ahn, Tongxin Zhang, Seohyon Jung

The paper audits six LLMs across four languages, finding that their gender stereotyping is significantly wider than human baselines and that cross-lingual translation fundamentally alters the nature o…


cs.HC, cs.AI · Recent · May 27, 2026

The Decision to Verify: How Warmth and User Characteristics Shape Reliance on Conversational Agents for Information Search

Mert Yazan, Frederik Bungaran Ishak Situmeang, Suzan Verberne

Despite having access to web search, users' reliance on conversational AI for information remains high, driven primarily by pre-existing trust and influenced indirectly by the chatbot's conversational…


cs.AI · Recent · May 27, 2026

Practitioner Beliefs and Behaviors in AI-Enhanced Education: DOT Framework Survey Evidence

David Gibson, M. Elizabeth Azukas, Gerald Knezek

This study surveyed higher education practitioners to map their beliefs and behaviors regarding AI integration, finding that while they view AI favorably, institutional barriers and gaps in design-ori…


cs.CL · Recent · May 28, 2026

When English Rewrites Local Knowledge: Global Narrative Dominance in Large Language Models

Md Arid Hasan, Ruwad Naswan, Farhan Samir, Sharifa Sultana +1 more

The paper demonstrates that using English prompts causes large language models to prioritize globally dominant narratives over local cultural knowledge, even when local evidence is provided.


cs.CL, cs.AI, cs.HC · Recent · May 28, 2026

EUDAIMONIA: Evaluating Undesirable Dynamics in AI

Jun Rui Huang, Wang Bill Zhu, Ziyi Liu, Nathanael Fast +2 more

The paper introduces EUDAIMONIA, a new framework and benchmark for evaluating how well LLMs align with user welfare in social interactions, finding that even state-of-the-art models frequently violate…


cs.CL, cs.AI · Recent · May 27, 2026

ChildEval: When large language models meet children's personalities

Yanyan Luo, Xue Han, Chunxu Zhao, Ruiqiao Bai +4 more

The paper introduces ChildEval, a large-scale benchmark designed to systematically evaluate how well large language models can infer and follow complex, child-specific preferences during long-context…


cs.HC, cs.AI, cs.CR · Recent · Apr 19, 2026

What Security and Privacy Transparency Users Need from Consumer-Facing Generative AI

Jiaxun Cao, Yu Dong, Chunxi Zhan, Rithvik Neti +2 more

The paper investigates how users perceive and utilize security and privacy transparency in consumer-facing generative AI, finding that users rely on proxies like popularity and require actionable, tru…


cs.CL, cs.AI, cs.CY · Recent · May 29, 2026

If LLMs Have Human-Like Attributes, Then So Does Age of Empires II

Adrian de Wynter

The paper argues that purported anthropomorphic attributes of LLMs are not unique to language models but are substrate-dependent, demonstrating this by training a neural network on the game Age of Emp…


cs.CR, cs.HC · Recent · Apr 7, 2026

Understanding User Privacy Perceptions of GenAI Smartphones

Ran Jin, Liu Wang, Shidong Pan, Luona Xu +2 more

This study investigates user perceptions of privacy risks associated with GenAI smartphones, finding that users express heightened concerns across the entire data lifecycle and suggest comprehensive,…
