Papers similar to 2606.00168

~ similar to 2606.00168· 20 results

cs.AIRecentMay 29, 2026

LLM-FACETS: A Privacy-Preserving Framework for Evaluating LLM Transparency and Accountability

Tom Lucas, Alessio Buscemi, Alfredo Capozucca, German Castignani +1 more

LLM-FACETS introduces an open-source, privacy-preserving framework designed to enable non-technical domain experts and compliance officers to audit and evaluate the transparency and accountability of…

View →

eess.AScs.AIcs.HCRecentMay 27, 2026

I Hear, Therefore I Trust: A Socio-Technical Investigation of Humans as Synthetic Speech Detectors

Lelia Erscoi, Tomi Kinnunen

This study investigates how humans detect synthetic speech in real-world contexts, finding that while overt detection failed for fully synthetic speech, participants still implicitly discriminated utt…

View →

cs.CLcs.AIcs.HCRecentMay 28, 2026

EUDAIMONIA: Evaluating Undesirable Dynamics in AI

Jun Rui Huang, Wang Bill Zhu, Ziyi Liu, Nathanael Fast +2 more

The paper introduces EUDAIMONIA, a new framework and benchmark for evaluating how well LLMs align with user welfare in social interactions, finding that even state-of-the-art models frequently violate…

View →

cs.CRcs.CYRecentApr 30, 2026

Tracking Conversations: Measuring Content and Identity Exposure on AI Chatbots

Muhammad Jazlan, Ethan Wang, Yash Vekaria, Zubair Shafiq

This paper systematically measured web tracking across 20 popular AI chatbots, finding that a majority share both conversational content and user identity information with third parties.

View →

cs.CRcs.LGRecentMay 23, 2026

CyberMaskQA: A Privacy-Aware Benchmark for Evaluating Large Language Models in Cybersecurity Question Answering

Matilda Gaddi, Jin Noh, Onat Gungor, Tajana Rosing

The paper introduces CYBERMASKQA, a novel privacy-aware benchmark designed to evaluate Large Language Models' ability to perform accurate cybersecurity question answering while simultaneously preservi…

View →

cs.CRcs.LGRecentMay 12, 2026

PrivacySIM: Evaluating LLM Simulation of User Privacy Behavior

James Flemings, Murali Annavaram

The paper introduces PrivacySIM, an evaluation suite that benchmarks how well LLMs can simulate individual user privacy decisions based on persona attributes, finding that while conditioning improves…

View →

cs.CLcs.AIRecentMay 27, 2026

The Fragility of Chain-of-Thought Monitoring Across Typologically Diverse Languages

Eric Onyame, Runtao Zhou, Kowshik Thopalli, Bhavya Kailkhura +1 more

This study demonstrates that Chain-of-Thought (CoT) monitoring is fundamentally fragile and unreliable for detecting misaligned behavior across typologically diverse languages, especially in low-resou…

View →

cs.CLcs.AIRecentMay 27, 2026

Models That Know How Evaluations Are Designed Score Safer

Katharina Deckenbach, Haritz Puerto, Jonas Geiping, Sahar Abdelnabi

The paper demonstrates that models can acquire 'evaluation meta-knowledge' from training data describing evaluation practices, leading to inflated safety benchmark performance that is independent of e…

View →

cs.CRRecentMar 20, 2026

Text-Based Personas for Simulating User Privacy Decisions

Kassem Fawaz, Ren Yi, Octavian Suciu, Rishabh Khandelwal +3 more

The paper introduces Narriva, a method that generates text-based synthetic privacy personas grounded in past user behavior to accurately and efficiently simulate individual and population-level privac…

View →

cs.CRRecentApr 4, 2026

AttackEval: A Systematic Empirical Study of Prompt Injection Attack Effectiveness Against Large Language Models

Jackson Wang

AttackEval systematically evaluates the effectiveness of 250 prompt injection prompts across ten attack categories, finding that composite and obfuscation attacks are highly effective against current…

View →

cs.CLRecentJun 1, 2026

Not What, But How: A Communicative Audit of LLM Response Framing

Siddhesh Milind Pawar, Sarah Masud, Haneul Yoo, Alice Oh +1 more

The paper introduces FRANZ, a communicative audit framework, to evaluate how LLMs frame responses to subjective questions, finding that LLMs exhibit statistically significant and coupled differences i…

View →

cs.HCcs.AIRecentMay 27, 2026

The Decision to Verify: How Warmth and User Characteristics Shape Reliance on Conversational Agents for Information Search

Mert Yazan, Frederik Bungaran Ishak Situmeang, Suzan Verberne

Despite having access to web search, users' reliance on conversational AI for information remains high, driven primarily by pre-existing trust and influenced indirectly by the chatbot's conversational…

View →

cs.HCcs.AIcs.CRRecentApr 19, 2026

What Security and Privacy Transparency Users Need from Consumer-Facing Generative AI

Jiaxun Cao, Yu Dong, Chunxi Zhan, Rithvik Neti +2 more

The paper investigates how users perceive and utilize security and privacy transparency in consumer-facing generative AI, finding that users rely on proxies like popularity and require actionable, tru…

View →

cs.CRcs.ETcs.HCRecentMar 30, 2026

"What Did It Actually Do?": Understanding Risk Awareness and Traceability for Computer-Use Agents

Zifan Peng, Mingchen Li

The paper addresses the lack of user understanding regarding the actions and residual effects of advanced computer-use agents by proposing AgentTrace, a traceability framework for visualizing agent be…

View →

cs.AIcs.LGRecentMay 28, 2026

When Does Persona Prompting Actually Help? A Retrieval and Metric Analysis of Expert Role Injection in LLMs

Shuai Xiao, Su Liu, Weikai Zhou, Jialun Wu +3 more

Persona prompting does not universally improve LLM performance; instead, it systematically trades increased expertise depth for reduced clarity, making multi-metric evaluation essential.

View →

cs.AIRecentMay 27, 2026

Benchmarking AI for low-resource contexts: Thinking beyond leaderboards

Aakash Pant, Kavya Shah, Apoorv Agnihotri, Sneha Nikam +2 more

The paper critiques current AI benchmarking practices for low-resource settings, arguing that evaluation must shift focus from isolated model performance to the holistic performance of the deployed sy…

View →

cs.CRcs.AIRecentApr 10, 2026

Conflicts Make Large Reasoning Models Vulnerable to Attacks

Honghao Liu, Chengjin Xu, Xuhui Jiang, Cehao Yang +4 more

The paper demonstrates that confronting Large Reasoning Models (LRMs) with conflicting objectives, such as contradictory choices or conflicting alignment values, significantly increases their vulnerab…

View →

cs.HCcs.CRRecentMay 11, 2026

When Are LLM Inferences Acceptable? User Reactions and Control Preferences for Inferred Personal Information

Kyzyl Monteiro, Minjung Park, Alexander Ioffrida, Angelina Sanna +5 more

This study investigated user reactions to inferred personal information from their own ChatGPT histories, finding that acceptability is governed by context-sensitive norms regarding generation, retent…

View →

cs.CRRecentMay 4, 2026

PHANTOM: Polymorphic Honeytoken Adaptation with Narrative-Tailored Organisational Mimicry

Abraham Itzhak Weinberg

PHANTOM is a novel framework that generates highly convincing, context-aware honeytokens by incorporating deep organizational knowledge, significantly improving their believability and detection resis…

View →

cs.CLcs.AIcs.CRRecentMay 7, 2026

One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue

Xinjie Shen, Rongzhe Wei, Peizhi Niu, Haoyu Wang +5 more

The paper introduces TurnGate, a response-aware defense mechanism that detects the earliest turn in a multi-turn dialogue where the accumulated interaction enables a harmful action, significantly impr…

View →