Han Wang

24 indexed papers

Top categories

AI ×14, Crypto ×11, NLP ×9, ML ×5, Vision ×4, Multiagent ×3, Networking ×1, Robotics ×1

Frequent co-authors

Zihan Wang 9×

Minglai Yang 3×

Xiaohan Wang 3×

Yihan Wang 2×

Yuhan Wang 2×

Sihan Wang 2×

Research Timeline

2026

FP-Agent: Fingerprinting AI Browsing Agents

The paper introduces FP-Agent, a classifier, and shows that browser fingerprints are poor discriminators for AI browsing agents, whereas behavioral fingerprints (such as typing and scrolling patterns) are highly effective for distinguishing these agents from humans and from each other.

CoT-Guard: Small Models for Strong Monitoring

The paper introduces CoT-Guard, a small, cost-effective 4B-parameter model that significantly outperforms large, expensive monitors like GPT-5 in detecting hidden objectives in code generation tasks.

Extracting Training Data from Diffusion Language Models via Infilling

The paper introduces 'infilling extraction' to accurately model training data memorization in Diffusion Language Models (DLMs), finding that bidirectional masking significantly increases the extractability of verbatim training data compared to traditional prefix-only methods.

On the Hidden Costs of Counterfactual Knowledge Training in LLM Unlearning

This paper analyzes the limitations of Counterfactual Knowledge Training (CFT) for LLM unlearning, identifying knowledge conflict and hallucination spillover as major pitfalls that hinder its effectiveness.

ZipRL: Adaptive Multi-Turn Context Compression with Hindsight Response Replay

ZipRL introduces an adaptive context compression framework that significantly improves the performance and efficiency of LLMs in complex, multi-turn agent tasks by combining multi-granularity compression with Hindsight Response Replay.

Fine-Tuned LLM as a Complementary Predictor Improving Ads System

The paper introduces a novel paradigm where a fine-tuned LLM acts as an ancillary predictor to forecast likely advertisers, significantly improving ad recommendation systems by augmenting candidate generation and providing priors for downstream ranking.

Demystifying Data Organization for Enhanced LLM Training

This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM training.

Planning with the Views via Scene Self-Exploration

The paper addresses the challenge of multi-turn view planning for VLMs by proposing an iterative framework that uses self-exploration and view graph distillation, significantly improving planning performance over state-of-the-art models.

BAGEN: Are LLM Agents Budget-Aware?

This paper introduces the concept of Budget-Aware Agents (BAGEN), showing that current LLM agents often fail to manage resources proactively, and proposes that incorporating early stopping and interval estimation significantly improves efficiency.

Seeing Before Agreeing: Aligning Multi-Agent Consensus with Visual Evidence

The paper proposes EAGLE, a novel evidence-aligned multi-agent framework, demonstrating that requiring shared visual evidence among agents is crucial for achieving reliable and trustworthy consensus in multimodal Visual Question Answering (VQA).

Healthcare Mechanisms from Policy-as-Code Search under Strategic Provider Response

The paper models healthcare mechanism design as program synthesis, demonstrating that an optimized, mixed-objective program can eliminate up-coding and reduce patient rejection while maintaining financial viability.

Are Full Rollouts Necessary for On-Policy Distillation?

This paper proposes two horizon-control strategies, Progressive OPD (POPD) and Truncated OPD (TOPD), demonstrating that full rollouts are often unnecessary for On-Policy Distillation, leading to significant improvements in training efficiency.

Triaging Threats to Specialized Guardrails

The paper introduces RouteGuard, a router-expert framework, to improve the robustness and generalization of safety guardrails by specializing threat detection across multiple distinct unsafe categories.

Dr. DocBench: A Comprehensive Benchmark for Expert-Level and Difficult Document Parsing

The paper introduces Dr. DocBench, a difficulty-aware, comprehensive benchmark designed to rigorously test expert-level and challenging document parsing capabilities for VLMs, demonstrating that current state-of-the-art models fail on complex, domain-specific structures.

Connecting the Dots: Benchmarking Reflective Memory in Long-Horizon Dialogue

The paper introduces RefMem-Bench, a new benchmark for measuring reflective memory in long-horizon dialogue, and proposes REMIND, a framework that significantly improves models' ability to synthesize fragmented cues into high-level interpretations.

GloResNet: A lightweight 3D CNN with global topological features for preterm brain injury prediction

The paper proposes GloResNet, a lightweight 3D CNN that effectively predicts brain injury in preterm infants using T2-weighted MRI, achieving an average accuracy of 75.18%.

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

The paper introduces AutoMedBench, a novel workflow-aware benchmark that evaluates autonomous medical-AI agents across a five-stage research process, revealing that agents struggle most with validation and submission.

RadioMaster: Multi-Agent System for Autonomous Radio Signal Generation

The paper introduces RadioMaster, a novel multi-agent system that successfully translates high-level user intents into physically viable, real-world radio signals, significantly outperforming existing methods.

Sequential Data Poisoning in LLM Post-Training

The paper introduces the threat model of sequential data poisoning, demonstrating that multiple, collaborating attackers can exploit compound vulnerabilities in LLM post-training pipelines that are invisible when analyzing individual stages.

Papers

cs.LG · cs.CR · Recent · Jun 3, 2026

Sequential Data Poisoning in LLM Post-Training

Jack Sanderson, Yihan Wang, Xiaoqian Lu, Gautam Kamath +1 more

cs.CV · Recent · Jun 1, 2026