ArXivCSExplorer

Han Wang

24 indexed papers

Recent (6 mo): 24
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year (chart; recovered labels: 24, 26)

Top categories

AI ×14, Crypto ×11, NLP ×9, ML ×5, Vision ×4, Multiagent ×3, Networking ×1, Robotics ×1

Frequent co-authors

Zihan Wang 9×
Minglai Yang 3×
Xiaohan Wang 3×
Yihan Wang 2×
Yuhan Wang 2×
Sihan Wang 2×

Research Timeline

2026
FP-Agent: Fingerprinting AI Browsing Agents

The paper introduces FP-Agent, a classifier, and shows that browser fingerprints are poor discriminators for AI browsing agents, whereas behavioral fingerprints (such as typing and scrolling patterns) are highly effective for distinguishing these agents from humans and from each other.

CoT-Guard: Small Models for Strong Monitoring

The paper introduces CoT-Guard, a small, cost-effective 4B-parameter model that significantly outperforms large, expensive monitors like GPT-5 in detecting hidden objectives in code generation tasks.

Extracting Training Data from Diffusion Language Models via Infilling

The paper introduces 'infilling extraction' to accurately model training data memorization in Diffusion Language Models (DLMs), finding that bidirectional masking significantly increases the extractability of verbatim training data compared to traditional prefix-only methods.

On the Hidden Costs of Counterfactual Knowledge Training in LLM Unlearning

This paper analyzes the limitations of Counterfactual Knowledge Training (CFT) for LLM unlearning, identifying knowledge conflict and hallucination spillover as major pitfalls that hinder its effectiveness.

ZipRL: Adaptive Multi-Turn Context Compression with Hindsight Response Replay

ZipRL introduces an adaptive context compression framework that significantly improves the performance and efficiency of LLMs in complex, multi-turn agent tasks by combining multi-granularity compression with Hindsight Response Replay.

Fine-Tuned LLM as a Complementary Predictor Improving Ads System

The paper introduces a novel paradigm where a fine-tuned LLM acts as an ancillary predictor to forecast likely advertisers, significantly improving ad recommendation systems by augmenting candidate generation and providing priors for downstream ranking.

Demystifying Data Organization for Enhanced LLM Training

This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM training.

Planning with the Views via Scene Self-Exploration

The paper addresses the challenge of multi-turn view planning for VLMs by proposing an iterative framework that uses self-exploration and view graph distillation, significantly improving planning performance over state-of-the-art models.

BAGEN: Are LLM Agents Budget-Aware?

This paper introduces the concept of Budget-Aware Agents (BAGEN), showing that current LLM agents often fail to manage resources proactively, and proposes that incorporating early stop and interval estimation significantly improves efficiency.

Seeing Before Agreeing: Aligning Multi-Agent Consensus with Visual Evidence

The paper proposes EAGLE, a novel evidence-aligned multi-agent framework, demonstrating that requiring shared visual evidence among agents is crucial for achieving reliable and trustworthy consensus in multimodal Visual Question Answering (VQA).

Healthcare Mechanisms from Policy-as-Code Search under Strategic Provider Response

The paper models healthcare mechanism design as program synthesis, demonstrating that an optimized, mixed-objective program can eliminate up-coding and reduce patient rejection while maintaining financial viability.

Are Full Rollouts Necessary for On-Policy Distillation?

This paper proposes two horizon-control strategies, Progressive OPD (POPD) and Truncated OPD (TOPD), demonstrating that full rollouts are often unnecessary for On-Policy Distillation, leading to significant improvements in training efficiency.

Triaging Threats to Specialized Guardrails

The paper introduces RouteGuard, a router-expert framework, to improve the robustness and generalization of safety guardrails by specializing threat detection across multiple distinct unsafe categories.

Dr. DocBench: A Comprehensive Benchmark for Expert-Level and Difficult Document Parsing

The paper introduces Dr. DocBench, a difficulty-aware, comprehensive benchmark designed to rigorously test expert-level and challenging document parsing capabilities for VLMs, demonstrating that current state-of-the-art models fail on complex, domain-specific structures.

Connecting the Dots: Benchmarking Reflective Memory in Long-Horizon Dialogue

The paper introduces RefMem-Bench, a new benchmark for measuring reflective memory in long-horizon dialogue, and proposes REMIND, a framework that significantly improves models' ability to synthesize fragmented cues into high-level interpretations.

GloResNet: A lightweight 3D CNN with global topological features for preterm brain injury prediction

The paper proposes GloResNet, a lightweight 3D CNN that effectively predicts brain injury in preterm infants using T2-weighted MRI, achieving an average accuracy of 75.18%.

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

The paper introduces AutoMedBench, a novel workflow-aware benchmark that evaluates autonomous medical-AI agents across a five-stage research process, revealing that agents struggle most with validation and submission.

RadioMaster: Multi-Agent System for Autonomous Radio Signal Generation

The paper introduces RadioMaster, a novel multi-agent system that successfully translates high-level user intents into physically viable, real-world radio signals, significantly outperforming existing methods.

Sequential Data Poisoning in LLM Post-Training

The paper introduces the threat model of sequential data poisoning, demonstrating that multiple, collaborating attackers can exploit compound vulnerabilities in LLM post-training pipelines that are invisible when analyzing individual stages.

Papers

cs.LG, cs.CR · Recent · Jun 3, 2026

Sequential Data Poisoning in LLM Post-Training

Jack Sanderson, Yihan Wang, Xiaoqian Lu, Gautam Kamath +1 more

The paper introduces the threat model of sequential data poisoning, demonstrating that multiple, collaborating attackers can exploit compound vulnerabilities in LLM post-training pipelines that are in…

cs.CV · Recent · Jun 1, 2026

GloResNet: A lightweight 3D CNN with global topological features for preterm brain injury prediction

Boyu Yuan, Jiamiao Lu, Weichuan Zhang, Benqing Wu +4 more

The paper proposes GloResNet, a lightweight 3D CNN that effectively predicts brain injury in preterm infants using T2-weighted MRI, achieving an average accuracy of 75.18%.

cs.AI · Recent · Jun 1, 2026

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

Junqi Liu, Salena Song, Yuhan Wang, Jiawei Mao +11 more

The paper introduces AutoMedBench, a novel workflow-aware benchmark that evaluates autonomous medical-AI agents across a five-stage research process, revealing that agents struggle most with validatio…

cs.MA, cs.AI, cs.NI · Recent · Jun 1, 2026

RadioMaster: Multi-Agent System for Autonomous Radio Signal Generation

Jiazhen Lei, Tianze Cao, Yuxin Sha, Sihan Wang +4 more

The paper introduces RadioMaster, a novel multi-agent system that successfully translates high-level user intents into physically viable, real-world radio signals, significantly outperforming existing…

cs.CL, cs.AI, cs.CV · Recent · May 31, 2026

Dr. DocBench: A Comprehensive Benchmark for Expert-Level and Difficult Document Parsing

Minglai Yang, Xinyan Velocity Yu, Pengyuan Li, Xinyu Guo +21 more

The paper introduces Dr. DocBench, a difficulty-aware, comprehensive benchmark designed to rigorously test expert-level and challenging document parsing capabilities for VLMs, demonstrating that curre…

cs.CL, cs.AI · Recent · May 31, 2026

Connecting the Dots: Benchmarking Reflective Memory in Long-Horizon Dialogue

Jingjie Lin, Bingbing Wang, Zihan Wang, Zhengda Jin +3 more

The paper introduces RefMem-Bench, a new benchmark for measuring reflective memory in long-horizon dialogue, and proposes REMIND, a framework that significantly improves models' ability to synthesize…

cs.LG, cs.AI, cs.CL · Recent · May 29, 2026

BAGEN: Are LLM Agents Budget-Aware?

Yuxiang Lin, Zihan Wang, Mengyang Liu, Yuxuan Shan +8 more

This paper introduces the concept of Budget-Aware Agents (BAGEN), showing that current LLM agents often fail to manage resources proactively, and proposes that incorporating early stop and interval es…

cs.CV, cs.AI, cs.MA · Recent · May 29, 2026

Seeing Before Agreeing: Aligning Multi-Agent Consensus with Visual Evidence

Yuhan Wang, Shuochen Chang, Yalin Feng, Dongsheng Ma +7 more

The paper proposes EAGLE, a novel evidence-aligned multi-agent framework, demonstrating that requiring shared visual evidence among agents is crucial for achieving reliable and trustworthy consensus i…

cs.AI, cs.MA · Recent · May 29, 2026

Healthcare Mechanisms from Policy-as-Code Search under Strategic Provider Response

Zihan Wang, Xiang Xu, Hongyuan Zha, Wenhao Li

The paper models healthcare mechanism design as program synthesis, demonstrating that an optimized, mixed-objective program can eliminate up-coding and reduce patient rejection while maintaining finan…

cs.CL · Recent · May 29, 2026

Are Full Rollouts Necessary for On-Policy Distillation?

Yaocheng Zhang, Jiajun Chai, Yuqian Fu, Songjun Tu +6 more

This paper proposes two horizon-control strategies, Progressive OPD (POPD) and Truncated OPD (TOPD), demonstrating that full rollouts are often unnecessary for On-Policy Distillation, leading to signi…

cs.CR, cs.CL · Recent · May 29, 2026

Triaging Threats to Specialized Guardrails

Wenjie Jacky Mo, Xiaofei Wen, Rui Cai, Boyu Zhu +5 more

The paper introduces RouteGuard, a router-expert framework, to improve the robustness and generalization of safety guardrails by specializing threat detection across multiple distinct unsafe categorie…

cs.AI, cs.CL · Recent · May 28, 2026

Demystifying Data Organization for Enhanced LLM Training

Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang +7 more

This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM trainin…

cs.AI, cs.CV, cs.RO · Recent · May 28, 2026

Planning with the Views via Scene Self-Exploration

Kangrui Wang, Linjie Li, Zhengyuan Yang, Shiqi Chen +6 more

The paper addresses the challenge of multi-turn view planning for VLMs by proposing an iterative framework that uses self-exploration and view graph distillation, significantly improving planning perf…

cs.AI · Recent · May 27, 2026

ZipRL: Adaptive Multi-Turn Context Compression with Hindsight Response Replay

Zhexin Hu, Li Wang, Xiaohan Wang, Jiajun Chai +3 more

ZipRL introduces an adaptive context compression framework that significantly improves the performance and efficiency of LLMs in complex, multi-turn agent tasks by combining multi-granularity compress…

cs.IR, cs.AI · Recent · May 27, 2026

Fine-Tuned LLM as a Complementary Predictor Improving Ads System

Hui Yang, Daiwei He, Kevin Jiang, Taejin Park +19 more

The paper introduces a novel paradigm where a fine-tuned LLM acts as an ancillary predictor to forecast likely advertisers, significantly improving ad recommendation systems by augmenting candidate ge…

cs.CL, cs.CR · Recent · May 26, 2026

On the Hidden Costs of Counterfactual Knowledge Training in LLM Unlearning

Xiaotian Ye, Xiaohan Wang, Mengqi Zhang, Shu Wu

This paper analyzes the limitations of Counterfactual Knowledge Training (CFT) for LLM unlearning, identifying knowledge conflict and hallucination spillover as major pitfalls that hinder its effectiv…

cs.CL, cs.AI, cs.CR · Recent · May 22, 2026

Extracting Training Data from Diffusion Language Models via Infilling

Yihan Wang, N. Asokan

The paper introduces 'infilling extraction' to accurately model training data memorization in Diffusion Language Models (DLMs), finding that bidirectional masking significantly increases the extractab…

cs.CR, cs.AI · Recent · May 12, 2026

CoT-Guard: Small Models for Strong Monitoring

Nirav Diwan, Han Wang, Berkcan Kapusuzoglu, Ramin Moradi +5 more

The paper introduces CoT-Guard, a small, cost-effective 4B-parameter model that significantly outperforms large, expensive monitors like GPT-5 in detecting hidden objectives in code generation tasks.

cs.CR · Recent · May 2, 2026

FP-Agent: Fingerprinting AI Browsing Agents

Ethan Wang, Zubair Shafiq, Yash Vekaria

The paper introduces FP-Agent, a classifier that demonstrates that while browser fingerprints are poor discriminators for AI browsing agents, behavioral fingerprints (like typing and scrolling pattern…
