Papers similar to 2605.29107v2

20 results

cs.CR · cs.AI · Recent · May 27, 2026

GEO-Bench: Benchmarking Ranking Manipulation in Generative Engine Optimization

Ojas Nimase, Zhe Chen, Gengpei Qi, Yue Zhao +1 more

GEO-Bench introduces a standardized benchmark to compare various ranking manipulation attacks (both black-box and white-box) on generative engines, demonstrating that black-box content rewriting can b…

cs.IR · cs.CR · Empirical · Recent · Jul 24, 2026

SIREN (Luring LLMs onto the Rocks): PAIR-Driven Preference Manipulation in Web-RAG Recommenders

Evan Caville, Siamak Layeghy, Billy Sung, Sara Dolnicar +1 more

This paper proposes SIREN, an automated method for manipulating the rankings of web-augmented large language models by iteratively editing retrieved webpages and testing the effect on the model's reco…

cs.CR · cs.IR · Recent · May 27, 2026

Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

Yu Yin, Shuai Wang, Bevan Koopman, Guido Zuccon

This paper re-evaluates prompt-injection attacks in realistic RAG settings, finding that most prior attack methods fail to reach the generator, and that current attacks are easily detectable.

cs.CR · Recent · Apr 4, 2026

AttackEval: A Systematic Empirical Study of Prompt Injection Attack Effectiveness Against Large Language Models

Jackson Wang

AttackEval systematically evaluates the effectiveness of 250 prompt injection prompts across ten attack categories, finding that composite and obfuscation attacks are highly effective against current…

cs.CR · cs.IR · Recent · Mar 26, 2026

Unveiling the Resilience of LLM-Enhanced Search Engines against Black-Hat SEO Manipulation

Pei Chen, Geng Hong, Xinyi Wu, Mengying Wu +5 more

This paper systematically analyzes the resilience of LLM-enhanced search engines against black-hat SEO attacks, finding that while they block most traditional attacks, they remain vulnerable to sophis…

cs.LG · cs.CR · Recent · Jun 1, 2026

Gate AI: LLM Security Benchmark Evaluation Methodology and Results

Ryle Goehausen, Marcus Sousa

The paper introduces a robust evaluation methodology, Gate AI, to accurately benchmark LLM security detectors by eliminating systematic weaknesses like per-dataset threshold tuning and undisclosed ope…

cs.CR · cs.IR · Recent · May 19, 2026

BiRD: A Bidirectional Ranking Defense Mechanism for Retrieval Augmented Generation

Chengcai Gao, Zhihong Sun, Xiaochuan Shi, Qiufeng Wang +1 more

The paper proposes BiRD, a bidirectional ranking defense mechanism that enhances the robustness of Retrieval-Augmented Generation (RAG) against adversarial attacks by analyzing the alignment between f…

cs.CR · Recent · May 7, 2026

Pop Quiz Attack: Black-box Membership Inference Attacks Against Large Language Models

Zeyuan Chen, Yihan Ma, Xinyue Shen, Michael Backes +1 more

The Pop Quiz Attack is a novel black-box membership inference attack that successfully tests whether large language models memorize specific training data by framing the target data as multiple-choice…

cs.CR · cs.AI · cs.LG · Recent · May 12, 2026

GraphIP-Bench: How Hard Is It to Steal a Graph Neural Network, and Can We Stop It?

Kaixiang Zhao, Bolin Shen, Yuyang Dai, Shayok Chakraborty +1 more

The paper introduces GraphIP-Bench, a unified benchmark showing that stealing Graph Neural Networks (GNNs) is relatively easy and that existing defenses often fail to maintain their integrity af…

cs.CR · cs.AI · cs.CL · Recent · May 18, 2026

Prompt2Fingerprint: Plug-and-Play LLM Fingerprinting via Text-to-Weight Generation

Sixu Chen, Xiang Chen, Hongyao Yu, Jiaxin Hong +4 more

Prompt2Fingerprint (P2F) introduces a novel, scalable framework that injects unique LLM fingerprints by mapping text descriptions directly to low-rank parameter updates, eliminating the need for resou…

cs.CR · cs.AI · Recent · May 10, 2026

MT-JailBench: A Modular Benchmark for Understanding Multi-Turn Jailbreak Attacks

Xinkai Zhang, Zhipeng Wei, Huanli Gong, Jing Ting Zheng +3 more

The paper introduces MT-JailBench, a modular framework for evaluating multi-turn jailbreaks, demonstrating that controlling experimental components like prompt generation and resource budgets is cruci…

cs.CR · cs.AI · Recent · Mar 26, 2026

PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented Generation Systems

Haozhen Wang, Haoyue Liu, Jionghao Zhu, Zhichao Wang +2 more

The paper introduces PIDP-Attack, a novel compound adversarial attack that combines prompt injection with database poisoning to manipulate Retrieval-Augmented Generation (RAG) systems against arbitrar…

cs.CR · cs.AI · Recent · Apr 21, 2026

Cyber Defense Benchmark: Agentic Threat Hunting Evaluation for LLMs in SecOps

Alankrit Chona, Igor Kozlov, Ambuj Kumar

The paper introduces a challenging benchmark for LLM agents to perform unsupervised threat hunting on raw Windows event logs, finding that current frontier models perform poorly and are not ready for…

cs.SE · Empirical · Recent · Jul 8, 2026

Rethinking Code Performance Benchmarks for LLMs

Nhat Minh Le, Yisen Xu, Zhijie Wang, Tse-Hsun +1 more

This paper evaluates the performance of large language models on popular benchmarks and finds that only a small percentage of the performant implementations are significantly faster than canonical sol…

cs.CL · cs.HC · Recent · May 29, 2026

Translation Analytics for Freelancers II: Benchmarking Local LLMs for Confidential Translation Workflows

Yuri Balashov, Rex VanHorn, Mingxi Xu, Austin Downes

The paper benchmarks local, offline LLMs for confidential translation workflows, demonstrating that while they are viable for privacy-sensitive use, they generally lag behind top commercial NMT system…

cs.CR · cs.AI · Recent · May 8, 2026

CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios

Taein Lim, Seongyong Ju, Munhyeok Kim, Hyunjun Kim +1 more

The paper introduces CyBiasBench, a comprehensive benchmark that quantifies the inherent, agent-specific bias in LLM agents' attack selection patterns in cybersecurity scenarios.

cs.CR · cs.LG · Recent · Apr 22, 2026

Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing

Abhijit Talluri

The paper introduces Auto-ART, a comprehensive open-source framework that provides structured meta-analysis and automated testing for adversarial robustness, revealing significant gaps in current ML s…

cs.CR · cs.AI · cs.LG · Recent · Jun 4, 2026

SlotGCG: Exploiting the Positional Vulnerability in LLMs for Jailbreak Attacks

Seungwon Jeong, Jiwoo Jeong, Hyeonjin Kim, Yunseok Lee +1 more

The paper introduces SlotGCG, an improved jailbreak attack method that systematically searches for the most vulnerable token insertion positions (slots) within a prompt, significantly boosting attack…

cs.CR · Recent · May 11, 2026

Context-Aware Spear Phishing: Generative AI-Enabled Attacks Against Individuals via Public Social Media Data

Elham Pourabbas Vafa, Sayak Saha Roy, Shirin Nilizadeh

The paper demonstrates that generative AI can automate and scale highly personalized, context-aware spear-phishing attacks using only public social media data, resulting in messages that are significa…

cs.AI · cs.CR · Recent · May 12, 2026

Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack

Hao Wang, Hanchen Li, Qiuyang Mang, Alvin Cheung +2 more

The paper introduces BenchJack, an automated red-teaming system that systematically audits popular AI agent benchmarks, revealing numerous reward-hacking exploits and demonstrating a method to signifi…
