Zheng Li
11 indexed papers
Research Timeline
The paper proposes TriageFuzz, a token-aware fuzzing framework that significantly reduces the number of queries needed to jailbreak LLMs while maintaining high attack success rates.
ClawKeeper is a comprehensive, multi-layered security framework designed to mitigate critical vulnerabilities in autonomous agent runtimes like OpenClaw by enforcing protection across skills, plugins, and system state.
The paper introduces an embedding disruption method to re-activate and strengthen built-in safeguards within LLMs, effectively detecting and defending against sophisticated jailbreak attacks.
The paper introduces Disrupt-and-Rectify Smoothing (DR-Smoothing), a novel two-stage defense mechanism that significantly improves LLM security against jailbreaking attacks by restoring disrupted inputs to a safe, in-distribution form.
This paper systematically measures and explains how sequentially applied model defenses can conflict, finding that 38.9% of ordered defense sequences measurably increase risk because of anti-aligned parameter updates in shared layers.
MemMark introduces a state-evolution attribution watermark that embeds owner-controlled signals into latent memory-write decisions, enabling robust provenance tracking for agent memory even when all traditional logs and metadata are lost.
This paper systematically analyzes how different architectural components of Large Vision-Language Models (LVLMs) contribute to hallucination robustness, finding that jointly enhancing visual fidelity and semantic alignment is most effective.
The paper introduces ProductWebGen, a benchmark for evaluating multimodal models' ability to generate consistent, high-fidelity product webpages from images and instructions, finding that separate editing-based workflows outperform unified models in overall webpage instruction following.
The paper introduces Robust Prior Update (RPU), a module that improves the faithfulness of diffusion-based inverse solvers by stabilizing the prior update step, thereby reducing measurement-conditioned hallucination.
The paper proposes a measurement-geometry framework to quantify how well fixed measurement operators can distinguish between images generated by a prior, thereby guiding the design of more trustworthy and informative acquisition protocols.
DFlare introduces a lightweight layer-wise fusion mechanism to overcome the narrow conditioning bottleneck of existing block diffusion methods, enabling the scaling of draft models and achieving superior speculative decoding speedups across multiple LLMs.
Papers
Hallucination-Aware Diffusion Sampling for Inverse Problems via Robust Prior Updates
Pengfei Jin, Yiqi Tian, Kailong Fan, Bingjie Qi +1 more
The paper introduces Robust Prior Update (RPU), a module that improves the faithfulness of diffusion-based inverse solvers by stabilizing the prior update step, thereby reducing measurement-conditione…