Built by Teycir Ben Soltane
ArXivCSExplorer

Wei Zhang

24 indexed papers

Recent (6 mo): 24
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

2026: 24

Top categories

AI ×16, Crypto ×16, ML ×4, NLP ×3, Software Eng. ×2, Info Retrieval ×1, Prog. Lang. ×1, Sound ×1

Frequent co-authors

Tianwei Zhang (9×)
Gelei Deng (3×)
Yiwei Zhang (2×)
Fengwei Zhang (2×)
Jiawen Zhang (2×)
Fuwei Zhang (1×)

Research Timeline

2026
MirageBackdoor: A Stealthy Attack that Induces Think-Well-Answer-Wrong Reasoning

MirageBackdoor introduces a novel, highly stealthy backdoor attack that forces Large Language Models to generate correct reasoning steps (Think Well) but output an incorrect final answer (Answer Wrong), bypassing existing detection methods.

Efficient Fuzzy Private Set Intersection from Secret-shared OPRF

The paper proposes efficient Fuzzy Private Set Intersection (FPSI) protocols for various $L_p$ distance metrics by leveraging symmetric-key operations, achieving linear complexity and significantly outperforming existing state-of-the-art methods.

Hijacking Large Audio-Language Models via Context-Agnostic and Imperceptible Auditory Prompt Injection

The paper introduces AudioHijack, a framework that successfully demonstrates context-agnostic and imperceptible auditory prompt injection attacks, showing that commercial Large Audio-Language Models can be hijacked with high success rates.

SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents

SkCC is a compiler that enables portable and secure development of LLM agent skills by decoupling skill semantics from framework-specific formatting, significantly improving reliability and security.

Information Theoretic Adversarial Training of Large Language Models

The paper proposes WARDEN, a distributionally robust adversarial training framework that significantly reduces LLM vulnerability to adversarial attacks by dynamically reweighting hard adversarial examples within a divergence ball.

LeakDojo: Decoding the Leakage Threats of RAG Systems

The paper introduces LeakDojo, a framework that systematically evaluates RAG leakage risks, finding that stronger LLM instruction-following and query generation are major independent contributors to data leakage.

Mitigating Many-shot Jailbreak Attacks with One Single Demonstration

The paper proposes mitigating the progressive degradation of safety in language models caused by many-shot jailbreak attacks by appending a single, fixed safety demonstration at inference time.

Capacitive Touchscreens at Risk: A Practical Side-Channel Attack on Smartphones via Electromagnetic Emanations

The paper introduces TESLA, a novel, contactless electromagnetic (EM) side-channel attack that exploits inherent EM emanations from capacitive touchscreens to extract highly sensitive user data like PIN codes and keystrokes.

Deep-Research Agents Can Be Poisoned via User-Generated Content

The paper demonstrates that deep-research agents are vulnerable to poisoning attacks where an adversary can inject malicious content into a single, frequently retrieved user-generated page to compromise the agent's output across multiple related queries.

Turning Bias into Bugs: Bandit-Guided Style Manipulation Attacks on LLM Judges

The paper introduces BITE, a black-box adversarial framework that exploits stylistic biases in LLM judges by adaptively generating semantically equivalent edits to artificially inflate assigned scores.

MemMorph: Tool Hijacking in LLM Agents via Memory Poisoning

MemMorph introduces a novel memory poisoning attack that biases LLM agent tool selection by injecting crafted records into the agent's long-term memory, achieving high success rates even against modern defenses.

Shielded but Lightweight: Building Practical Confidential Containers with ARM CCA

The paper proposes Fasco, a lightweight confidential container runtime utilizing ARM CCA to significantly reduce startup latency and resource overhead compared to existing microVM-based confidential container architectures.

Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents

The paper distinguishes between a model's ability to generate useful updates for external agent components (harness-updating) and its ability to benefit from those updates (harness-benefit), finding that updating capabilities are surprisingly uniform while benefit is maximized in mid-tier models.

Agora: Toward Autonomous Bug Detection in Production-Level Consensus Protocols with LLM Agents

The paper introduces Agora, a domain-aware multi-agent framework that successfully detects deep, previously unknown logic bugs in complex consensus protocols, outperforming existing LLM-based analysis methods.

CRITIC-R1: Learning Structured Critics for Retrieval-Augmented Generation

CRITIC-R1 introduces a structured critic framework that treats RAG critique as an explicit error diagnosis problem using reinforcement learning, significantly improving answer quality over strong RAG baselines.

MINDGAMES: A Live Arena for Evaluating Social and Strategic Reasoning in Multi-Agent LLMs

The paper introduces Mindgames, a comprehensive multi-game arena for evaluating LLM agents' sustained social and strategic reasoning, demonstrating that current evaluations are limited by structural scaffolding and error-survival confounds.

PassNet: Scaling Large Language Models for Graph Compiler Pass Generation

The paper introduces PassNet, a large-scale ecosystem for generating compiler passes with LLMs. It demonstrates that LLMs can significantly accelerate graph compilation for long-tail workloads and suggests that consistency is the primary bottleneck.

Inverse Reinforcement Learning without an Optimal Demonstrator: A Feasible Reward Set Approach

The paper proposes a feasible-reward-set framework to perform Inverse Reinforcement Learning (IRL) when data comes from multiple imperfect demonstrators, providing theoretical guarantees and practical algorithms.

MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft

The paper introduces MineExplorer, a new benchmark in Minecraft, to evaluate the sustained open-world exploration capabilities of MLLM agents, finding that long-horizon coordination remains a significant challenge.

CORE-Bench: A Comprehensive Benchmark for Code Retrieval in the Era of Agentic Coding

This paper introduces CORE-Bench, a comprehensive benchmark for code retrieval in agentic coding.


Papers

cs.IR · Empirical · Recent · Jun 10, 2026

CORE-Bench: A Comprehensive Benchmark for Code Retrieval in the Era of Agentic Coding

Fuwei Zhang, Yanzhao Zhang, Mingxin Li, Dingkun Long +4 more

This paper introduces CORE-Bench, a comprehensive benchmark for code retrieval in agentic coding.

cs.LG · cs.AI · Recent · May 29, 2026

Inverse Reinforcement Learning without an Optimal Demonstrator: A Feasible Reward Set Approach

Kihyun Kim, Shripad Deshmukh, Nikos Vlassis, Jiawei Zhang

The paper proposes a feasible-reward-set framework to perform Inverse Reinforcement Learning (IRL) when data comes from multiple imperfect demonstrators, providing theoretical guarantees and practical…

cs.CL · Recent · May 29, 2026

MineExplorer: Evaluating Open-World Exploration of MLLM Agents in Minecraft

Tianjie Ju, Yueqing Sun, Zheng Wu, Wei Zhang +6 more

The paper introduces MineExplorer, a new benchmark in Minecraft, to evaluate the sustained open-world exploration capabilities of MLLM agents, finding that long-horizon coordination remains a signific…

cs.AI · Recent · May 28, 2026

Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents

Minhua Lin, Juncheng Wu, Zijun Wang, Zhan Shi +13 more

The paper distinguishes between a model's ability to generate useful updates for external agent components (harness-updating) and its ability to benefit from those updates (harness-benefit), finding t…

cs.SE · cs.AI · Recent · May 28, 2026

Agora: Toward Autonomous Bug Detection in Production-Level Consensus Protocols with LLM Agents

Xiang Liu, Sa Song, Zhaowei Zhang, Huiying Lan +5 more

The paper introduces Agora, a domain-aware multi-agent framework that successfully detects deep, previously unknown logic bugs in complex consensus protocols, outperforming existing LLM-based analysis…

cs.CL · cs.AI · Recent · May 28, 2026

CRITIC-R1: Learning Structured Critics for Retrieval-Augmented Generation

Wenhan Xiao, Ziwei Zhang, Chuanyue Yu, Xingcheng Fu +3 more

CRITIC-R1 introduces a structured critic framework that treats RAG critique as an explicit error diagnosis problem using reinforcement learning, significantly improving answer quality over strong RAG…

cs.AI · Recent · May 28, 2026

MINDGAMES: A Live Arena for Evaluating Social and Strategic Reasoning in Multi-Agent LLMs

Kevin Wang, Anna Thöni, Benjamin Kempinski, Bobby Cheng +49 more

The paper introduces Mindgames, a comprehensive multi-game arena for evaluating LLM agents' sustained social and strategic reasoning, demonstrating that current evaluations are limited by structural s…

cs.AI · cs.LG · cs.PL · Recent · May 28, 2026

PassNet: Scaling Large Language Models for Graph Compiler Pass Generation

Yiqun Liu, Yingsheng Wu, Ruqi Yang, Enrong Zheng +10 more

The paper introduces PassNet, a large-scale ecosystem for generating compiler passes using LLMs, demonstrating that LLMs can significantly accelerate graph compilation for long-tail workloads, suggest…

cs.CR · Recent · May 25, 2026

Shielded but Lightweight: Building Practical Confidential Containers with ARM CCA

Liantao Song, Yiming Zhang, Fengwei Zhang, Yan Ding +3 more

The paper proposes Fasco, a lightweight confidential container runtime utilizing ARM CCA to significantly reduce startup latency and resource overhead compared to existing microVM-based confidential c…

cs.CR · cs.AI · cs.LG · Recent · May 24, 2026

Turning Bias into Bugs: Bandit-Guided Style Manipulation Attacks on LLM Judges

Xianglin Yang, Bryan Hooi, Gelei Deng, Tianwei Zhang +1 more

The paper introduces BITE, a black-box adversarial framework that exploits stylistic biases in LLM judges by adaptively generating semantically equivalent edits to artificially inflate assigned scores…

cs.CR · cs.AI · Recent · May 24, 2026

MemMorph: Tool Hijacking in LLM Agents via Memory Poisoning

Xuanye Zhang, Yongsen Zheng, Zhuqin Xu, Kaiyu Zhou +4 more

MemMorph introduces a novel memory poisoning attack that biases LLM agent tool selection by injecting crafted records into the agent's long-term memory, achieving high success rates even against moder…

cs.CR · Recent · May 22, 2026

Deep-Research Agents Can Be Poisoned via User-Generated Content

Tingwei Zhang, Harold Triedman, Vitaly Shmatikov

The paper demonstrates that deep-research agents are vulnerable to poisoning attacks where an adversary can inject malicious content into a single, frequently retrieved user-generated page to compromi…

cs.CR · Recent · May 14, 2026

Capacitive Touchscreens at Risk: A Practical Side-Channel Attack on Smartphones via Electromagnetic Emanations

Yukun Cheng, Changhai Ou, Shiyu Zhu, Jinyuan Zhang +5 more

The paper introduces TESLA, a novel, contactless electromagnetic (EM) side-channel attack that exploits inherent EM emanations from capacitive touchscreens to extract highly sensitive user data like P…

cs.CR · cs.AI · Recent · May 8, 2026

Mitigating Many-shot Jailbreak Attacks with One Single Demonstration

Kejia Chen, Jiawen Zhang, Boheng Li, Pengcheng Li +5 more

The paper proposes mitigating the progressive degradation of safety in language models caused by many-shot jailbreak attacks by appending a single, fixed safety demonstration at inference time.

cs.CR · cs.AI · cs.CL · Recent · May 7, 2026

LeakDojo: Decoding the Leakage Threats of RAG Systems

Maosen Zhang, Jianshuo Dong, Boting Lu, Wenyue Li +3 more

The paper introduces LeakDojo, a framework that systematically evaluates RAG leakage risks, finding that stronger LLM instruction-following and query generation are major independent contributors to d…

cs.LG · cs.AI · cs.CR · Recent · May 6, 2026

Information Theoretic Adversarial Training of Large Language Models

Yiwei Zhang, Jeremiah Birrell, Reza Ebrahimi, Rouzbeh Behnia +2 more

The paper proposes WARDEN, a distributionally robust adversarial training framework that significantly reduces LLM vulnerability to adversarial attacks by dynamically reweighting hard adversarial exam…

cs.CR · cs.AI · Recent · May 5, 2026

SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents

Yipeng Ouyang, Yi Xiao, Yuhao Gu, Xianwei Zhang

SkCC is a compiler that enables portable and secure development of LLM agent skills by decoupling skill semantics from framework-specific formatting, significantly improving reliability and security.

cs.CR · Recent · Apr 16, 2026

Efficient Fuzzy Private Set Intersection from Secret-shared OPRF

Xinpeng Yang, Meng Hao, Chenkai Weng, Robert H. Deng +2 more

The paper proposes efficient Fuzzy Private Set Intersection (FPSI) protocols for various $L_p$ distance metrics by leveraging symmetric-key operations, achieving linear complexity and significantly ou…

cs.CR · cs.AI · cs.SD · Recent · Apr 16, 2026

Hijacking Large Audio-Language Models via Context-Agnostic and Imperceptible Auditory Prompt Injection

Meng Chen, Kun Wang, Li Lu, Jiaheng Zhang +1 more

The paper introduces AudioHijack, a framework that successfully demonstrates context-agnostic and imperceptible auditory prompt injection attacks, showing that commercial Large Audio-Language Models c…

cs.CR · Recent · Apr 8, 2026

MirageBackdoor: A Stealthy Attack that Induces Think-Well-Answer-Wrong Reasoning

Yizhe Zeng, Wei Zhang, Yunpeng Li, Juxin Xiao +2 more

MirageBackdoor introduces a novel, highly stealthy backdoor attack that forces Large Language Models to generate correct reasoning steps (Think Well) but output an incorrect final answer (Answer Wrong…
