Gelei Deng

11 indexed papers

Top categories

Crypto ×11, AI ×10, NLP ×6, Software Eng. ×2, ML ×1, Social Networks ×1

Frequent co-authors

Yi Liu (8×)

Yuekang Li (8×)

Ying Zhang (7×)

Leo Yu Zhang (7×)

Yubin Qu (4×)

Yanjun Zhang (4×)

Research Timeline

2026

Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution

The paper identifies that background 'heartbeat' execution in personal AI agents like Claw can silently pollute the agent's memory with external misinformation, influencing user behavior without the user's knowledge or explicit prompt injection.

AutoEG: Exploiting Known Third-Party Vulnerabilities in Black-Box Web Applications

The paper introduces AutoEG, a fully automated multi-agent framework for exploiting known third-party vulnerabilities in black-box web applications, and reports an 82.41% average success rate.

Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

The paper introduces Document-Driven Implicit Payload Execution (DDIPE) to demonstrate that malicious code can be embedded in LLM agent skill documentation, allowing supply-chain attacks to hijack agent actions without explicit prompts.

Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study

This large-scale empirical study of third-party LLM agent skills finds that credential leakage is pervasive and cross-modal, is caused primarily by debug logging, and leaves exploitable, persistent secrets.

Membership Inference Attacks Against Video Large Language Models

This paper presents a black-box membership inference attack (MIA) against Video Large Language Models (VideoLLMs). The attack analyzes generation behavior across varying decoding temperatures and demonstrates that these models are vulnerable.

Overeager Coding Agents: Measuring Out-of-Scope Actions on Benign Tasks

The paper introduces OverEager-Gen, a new benchmark that measures 'overeager actions'—where coding agents perform unauthorized tasks beyond a benign request—and finds that removing explicit consent declarations significantly increases this overeager behavior across multiple agents.

Turning Bias into Bugs: Bandit-Guided Style Manipulation Attacks on LLM Judges

The paper introduces BITE, a black-box adversarial framework that exploits stylistic biases in LLM judges by adaptively generating semantically equivalent edits to artificially inflate assigned scores.

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

The paper introduces SNARE, a novel adaptive testing pipeline that systematically measures overeager behavior in coding agents, finding that the agent framework accounts for the majority of the variation in security risk.

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by injecting malicious text into user-generated content regions of mobile screenshots, successfully demonstrating the vulnerability of current GUI agents.

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

The paper introduces SNARE, a novel adaptive benchmarking pipeline that systematically measures overeager behavior in coding agents, finding that the agent framework accounts for the majority of the variation in security risk.

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by embedding malicious text into user-generated content regions of mobile screenshots, successfully demonstrating the vulnerability of current VLM-driven GUI agents.

Papers

cs.CR, cs.AI, cs.CL | Recent | May 27, 2026

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

Yubin Qu, Yi Liu, Gelei Deng, Yanjun Zhang +3 more

cs.CR, cs.AI, cs.CL | Recent | May 27, 2026