Hong Zhang
10 indexed papers
Research Timeline
STEP is a black-box, retraining-free detector that profiles audio samples with dual perturbation branches, detecting backdoor attacks by exploiting the characteristic instability of hidden triggers.
This paper introduces TwoHamsters, a new benchmark that rigorously tests Multi-Concept Compositional Unsafety (MCCU) in text-to-image models, demonstrating that current state-of-the-art models and safety defenses are highly vulnerable to subtle, compositionally unsafe prompts.
PragLocker is a prompt protection scheme that makes valuable LLM agent prompts non-portable, securing them against theft and reuse on other proprietary models.
LegalGraphRAG is a multi-agent, hierarchical graph retrieval-augmented generation (RAG) framework that addresses the limitations of traditional RAG in legal domains and achieves state-of-the-art reliability in legal reasoning.
The paper introduces DOMINO, a novel inductive framework that synthesizes domain-specific data for LLMs using only reference examples, significantly improving performance on challenging, implicitly defined domains.
Restructuring the communication topology among robots provides significantly greater performance gains in multi-robot coordination than simply increasing the size of the onboard AI models, given fixed hardware budgets.
The paper introduces OpenClawBench, a large-scale dataset and framework for measuring process-side anomalies in real-world agent execution trajectories, demonstrating that task success does not guarantee operational reliability.
The paper proposes a memory-augmented, traversability-aware framework for outdoor vision-and-language navigation (VLN) that maintains stable, goal-consistent guidance even when semantic cues are interrupted or unavailable.
The paper introduces Atomic Decomposition and Recombination (ADR), a framework that generates genuinely novel and challenging verifiable code tasks, significantly improving the scalability of Reinforcement Learning with Verifiable Rewards (RLVR) for LLMs.
The paper introduces CV-Arena, a large-scale open benchmark for instructional computer vision, demonstrating that professional-grade image editing requires advanced capabilities in physical reasoning and structural control.
Papers
CV-Arena: An Open Benchmark for Instructional Computer Vision Problem Solving with Human-AI Collaborative Preferences
Fangzhou Lin, Peiran Li, Lingyu Xu, Wenjing Chen, and 11 more
The paper introduces CV-Arena, a large-scale open benchmark for instructional computer vision, demonstrating that professional-grade image editing requires advanced capabilities in physical reasoning…