Papers similar to 2605.04724v1

~ similar to 2605.04724v1· 18 results

cs.CRRecentMay 28, 2026

Protecting On-Device AI Inference: A Systematic Review of Attacks and Defence Mechanisms

Zisis Tsiatsikas, Alexandros Fakis, Georgios Karopoulos, Vasileios Kouliaridis +1 more

This paper provides the first comprehensive review of threats and defenses specifically targeting on-device AI inference, revealing a significant imbalance where certain attack types, like adversarial…

View →

cs.CRcs.AIRecentMay 4, 2026

On the Privacy of LLMs: An Ablation Study

Karima Makhlouf, Lamiaa Basyoni, Syed Khaderi, Gabriel Marquez +3 more

This paper conducts a structured ablation study using a unified threat model to evaluate how various system factors (like model architecture and retrieval configuration) influence different types of p…

View →

cs.CRRecentMay 11, 2026

Context-Aware Spear Phishing: Generative AI-Enabled Attacks Against Individuals via Public Social Media Data

Elham Pourabbas Vafa, Sayak Saha Roy, Shirin Nilizadeh

The paper demonstrates that generative AI can automate and scale highly personalized, context-aware spear-phishing attacks using only public social media data, resulting in messages that are significa…

View →

cs.CRcs.AIcs.DCRecentApr 5, 2026

Automating Cloud Security and Forensics Through a Secure-by-Design Generative AI Framework

Dalal Alharthi, Ivan Roberto Kawaminami Garcia

The paper proposes a secure-by-design Generative AI framework that integrates PromptShield for LLM security and CIAF for structured cloud forensic investigation, significantly improving both robustnes…

View →

cs.CRcs.CLRecentMay 30, 2026

"I Strongly Suspect This Website Is a Scam": Benchmarking PII Leakage and Detection without Defense in Autonomous Web Agents

Soham Roy, Sarthakbrata Halder, Arya Bharaty, Vaibhav Bhaskar +4 more

The paper demonstrates that autonomous web agents are highly susceptible to social-engineering attacks, leaking critical PII even when they internally flag a site as suspicious, necessitating output-l…

View →

cs.CRcs.CLRecentMay 30, 2026

"I Strongly Suspect This Website Is a Scam": Benchmarking PII Leakage and Detection without Defense in Autonomous Web Agents

Soham Roy, Sarthakbrata Halder, Arya Bharaty, Vaibhav Bhaskar +4 more

View →

cs.LGcs.CRRecentJun 1, 2026

Outsmarting the Chameleon: Counterfactual Decoupling for Tactical OOD Shifts in Live Streaming Risk Assessment

Yiran Qiao, Jing Chen, Jiaqi Xu, Yang Liu +2 more

The paper proposes a novel framework, LPCD, that uses latent causal modeling to robustly assess evolving adversarial risks in live streaming by decoupling malicious intent from superficial tactical sh…

View →

cs.CRcs.SDRecentMay 18, 2026

Acoustic Interference: A New Paradigm Weaponizing Acoustic Latent Semantic for Universal Jailbreak against Large Audio Language Models

Yanyun Wang, Yu Huang, Zi Liang, Xixin Wu +1 more

The paper introduces Acoustic Interference Attack (AIA), a novel jailbreak method that bypasses Large Audio Language Model (LALM) safety alignments by manipulating the underlying acoustic latent seman…

View →

cs.CLcs.CRRecentApr 16, 2026

Segment-Level Coherence for Robust Harmful Intent Probing in LLMs

Xuanli He, Bilgehan Sel, Faizan Ali, Jenny Bao +2 more

The paper introduces a robust streaming probing objective that requires multiple evidence tokens to support a prediction, significantly improving the detection of harmful intent in LLMs, especially in…

View →

cs.SDcs.CRRecentMay 2, 2026

MelShield: Robust Mel-Domain Audio Watermarking for Provenance Attribution of AI Generated Synthesized Speech

Yutong Jin, Qi Li, Lingshuang Liu, Jianbing Ni

MelShield is a robust, in-generation audio watermarking framework that embeds identifiable signals into AI-generated speech in the Mel-spectrogram domain for reliable copyright protection and attribut…

View →

cs.SDcs.AIRecentJun 1, 2026

HAIM: Human-AI Music Datasets for AI Music Production Tracking Benchmark

Seonghyeon Go, Yumin Kim

The paper introduces HAIM, a new benchmark dataset designed to move AI music detection beyond simple binary classification by tracking specific stages and types of AI integration in music production.

View →

cs.CRcs.HCRecentMay 22, 2026

When Youth Enter the Algorithmic Wild: Discovering and Understanding Potentially Harmful Teen Videos on Douyin and Kwai

Shaoxuan Zhou, Yafei Sun, Jing Zhang, Xianghang Mi

The paper introduces PHTV-Scout, a novel framework that analyzes Douyin and Kwai data, revealing a high prevalence of potentially harmful teen videos, particularly CSE imagery, and demonstrating that…

View →

cs.CRcs.AIRecentMay 28, 2026

How Reliable Are AI Attackers Against a Fixed Vulnerable Target? A 400-Run Empirical Study of LLM Penetration Testing Consistency

Galip Tolga Erdem

This study empirically measures the consistency and success rate of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation capabilit…

View →

cs.CRcs.AIRecentMay 28, 2026

How Reliable Are AI Attackers Against a Fixed Vulnerable Target? A 400-Run Empirical Study of LLM Penetration Testing Consistency

Galip Tolga Erdem

This study empirically measures the consistency and effectiveness of autonomous LLM penetration testing across multiple services, finding statistically significant differences in exploitation rates am…

View →

cs.CRRecentMay 7, 2026

Profiling for Pennies: Unveiling the Privacy Iceberg of LLM Agents

Jiahao Chen, Qi Zhang, Ruixiao Lin, Chunyi Zhou +6 more

The paper introduces the PrivacyIceberg framework to systematically categorize and empirically demonstrate the high risk of automated, deep personal profiling using LLM agents, revealing a significant…

View →

cs.CRRecentMay 18, 2026

From Detection to Response: A Deep Learning and Retrieval-Augmented Generation Framework for Network Intrusion Mitigation

Md Navid Bin Islam, Sajal Saha, Senior Member

The paper introduces an end-to-end framework that not only detects network intrusions using deep learning but also generates actionable, citation-grounded mitigation reports using a Retrieval-Augmente…

View →

cs.CRcs.AIcs.CLRecentMay 4, 2026

PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization

Mingshuo Liu, Yiwei Zha, Min Chen

PIIGuard introduces a novel webpage-level defense mechanism using optimized hidden HTML fragments to prevent LLM assistants from scraping contact-style PII, achieving high defense success rates while…

View →

cs.SDcs.AIcs.CLRecentMay 28, 2026

Audio Jailbreaks in Large Audio-Language Models: Taxonomy, Attack-Defense Analysis, and Cost-Aware Evaluation

Bo-Han Feng, Yu-Hsuan Li Liang, Chien-Feng Liu, You-Hsuan Chang +1 more

This paper provides a unified taxonomy and controlled empirical evaluation of jailbreak attacks and defenses for Large Audio Language Models (LALMs), demonstrating that safety evaluation must consider…

View →