Zhen Zhang
6 indexed papers
Research Timeline
Reflect-Guard enhances LLM safety classifiers by integrating logical self-reflection, significantly improving detection of sophisticated adversarial jailbreak prompts.
The paper introduces OmniVerifier-M1, a multimodal meta-verifier that uses symbolic outputs and decoupled reinforcement learning to provide robust, fine-grained verification and error localization for large multimodal models.
The paper evaluates LLM reasoning on Boolean satisfiability (SAT) problems, concluding that conventional metrics are misleading and proposing a paired-formula protocol with Accurate Differentiation Rate (ADR) for a more robust assessment.
ExpGraph is a model-agnostic framework that uses a self-evolving experience graph to enable LLM agents to reuse past successful strategies and failure lessons, significantly improving performance across diverse tasks.
ElasticMem treats memory as an elastic latent resource, allowing LLM agents to adaptively manage and inject variable-budget memories for improved performance in long-term reasoning tasks.
GIRL-DETR introduces Gradient-Isolated Reinforcement Learning to enhance temporal localization in lightweight Video Moment Retrieval models, achieving high accuracy by decoupling feature representation from metric optimization.
Papers
GIRL-DETR: Gradient-Isolated Reinforcement Learning for Video Moment Retrieval
Shihang Zhang, Mingjin Kuai, Ye Wei, Zhen Zhang +1 more