Huan Zhang

8 indexed papers

Top categories

Crypto ×5, Vision ×4, AI ×4, ML ×2, NLP ×1

Frequent co-authors

Rui Yang (2×)

Yuxi Chen (2×)

Gang Wang (2×)

Fan Yang (2×)

Binyan Xu (2×)

Di Tang (2×)

Research Timeline

2026

CAPTCHA Solving for Native GUI Agents: Automated Reasoning-Action Data Generation and Self-Corrective Training

The paper introduces ReCAP, a native GUI agent that significantly improves CAPTCHA solving success (from 30% to 80%) by integrating specialized CAPTCHA capabilities into a general-purpose, end-to-end vision-language model.

Client-Verifiable and Efficient Federated Unlearning in Low-Altitude Wireless Networks

The paper proposes VerFU, a client-verifiable federated unlearning framework for low-altitude wireless networks that allows devices to ensure the server accurately removes their historical data contributions without revealing the original data.

Beyond Nodes vs. Edges: A Multi-View Fusion Framework for Provenance-Based Intrusion Detection

The paper proposes PROVFUSION, a multi-view fusion framework that integrates anomaly signals from attribute, structure, and causality views to overcome the limitations of single node- or edge-centric provenance-based intrusion detection.

Trapping Attacker in Dilemma: Examining Internal Correlations and External Influences of Trigger for Defending GNN Backdoors

The paper proposes PRAETORIAN, a novel defense mechanism for Graph Neural Networks (GNNs) that targets the intrinsic structural requirements of backdoor attacks, significantly reducing the attack success rate while maintaining high clean accuracy.

CoT-Guard: Small Models for Strong Monitoring

The paper introduces CoT-Guard, a small, cost-effective 4B-parameter model that significantly outperforms large, expensive monitors like GPT-5 in detecting hidden objectives in code generation tasks.

GloResNet: A lightweight 3D CNN with global topological features for preterm brain injury prediction

The paper proposes GloResNet, a lightweight 3D CNN that effectively predicts brain injury in preterm infants using T2-weighted MRI, achieving an average accuracy of 75.18%.

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

The paper introduces OpenWebRL, an open framework that enables training visual web agents using online multi-turn Reinforcement Learning directly on live websites, achieving state-of-the-art performance on challenging web benchmarks.

PAR3D: A Unified 3D-MLLM with Part-Aware Representation for Scene Understanding

The paper introduces PAR3D, a unified part-aware 3D-MLLM framework, to enhance 3D scene understanding by enabling models to reason about and ground both whole objects and their fine-grained parts.

Papers

cs.CV · Recent · Jun 4, 2026

PAR3D: A Unified 3D-MLLM with Part-Aware Representation for Scene Understanding

Shaohui Dai, Yansong Qu, You Shen, Shengchuan Zhang +1 more

The paper introduces PAR3D, a unified part-aware 3D-MLLM framework, to enhance 3D scene understanding by enabling models to reason about and ground both whole objects and their fine-grained parts.

cs.CV · Recent · Jun 1, 2026