Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Han Zhao

Han Zhao

9 indexed papers

Recent (6 mo)
9
With code
0
Influential cites
0
Benchmarked
0

Publications per year

9
26

Top categories

AI×5Crypto×5NLP×4ML×3Vision×1Robotics×1Networking×1

Frequent co-authors

Jiacheng Liu2×
Xiaohan Zhao2×
Xinyi Shang2×
Jiacheng Cui2×
Zhiqiang Shen2×
Yunhan Zhao2×

Research Timeline

2026
WebPII: Benchmarking Visual PII Detection for Computer-Use Agents

The paper introduces WebPII, a novel, large-scale synthetic benchmark for detecting personally identifiable information (PII) in web screenshots, and demonstrates a model (WebRedact) that significantly improves detection accuracy and speed.

Digital Twin Enabled Simultaneous Learning and Modeling for UAV-assisted Secure Communications with Eavesdropping Attacks

The paper proposes a Digital Twin-enabled Simultaneous Learning and Modeling (DT-SLAM) framework to enhance secure communications in UAV-assisted networks against intelligent eavesdropping attacks, achieving significant gains in secure throughput.

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust, and reliable real-world agents.

ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models

The paper introduces ML-Bench, a policy-grounded multilingual safety benchmark, and ML-Guard, a superior guardrail model that enables culturally and legally aligned safety assessment for LLMs across 14 languages.

PRO-CUA: Process-Reward Optimization for Computer Use Agents

PRO-CUA introduces a process-reward optimization framework that enables efficient, step-level reinforcement learning for training computer use agents by decoupling environment interaction from policy optimization.

LLMSurgeon: Diagnosing Data Mixture of Large Language Models

The paper introduces LLMSurgeon, a framework that estimates the domain-level data mixture of a Large Language Model (LLM) using only generated text, thereby providing a post-hoc method to audit the model's 'digital DNA'.

CanLegalRAGBench: Evaluating Retrieval-Augmented Generation on Canadian Case Law

The paper introduces CanLegalRAGBench, a new Canadian legal QA benchmark, and evaluates RAG systems, finding that while open-source models are competitive, automatic evaluations struggle with nuanced legal retrieval and generation.

RogueMerge: Robust and Unified Attacks against LLM Model Merging

RogueMerge introduces a unified framework to robustly attack LLM model merging by addressing the challenges of autoregressive decoding, unknown merging configurations, and prompt generalization, significantly outperforming prior methods.

Operation-Guided Progressive Human-to-AI Text Transformation Benchmark for Multi-Granularity AI-Text Detection

The paper introduces OpAI-Bench, a novel benchmark designed to study how AI authorship signals evolve and accumulate during the progressive co-editing process between humans and AI.

Highlighted terms show continued research focus across papers

Papers

cs.CLcs.AIcs.LGRecentJun 4, 2026

Operation-Guided Progressive Human-to-AI Text Transformation Benchmark for Multi-Granularity AI-Text Detection

Sondos Mahmoud Bsharat, Jiacheng Liu, Xiaohan Zhao, Tianjun Yao +8 more

The paper introduces OpAI-Bench, a novel benchmark designed to study how AI authorship signals evolve and accumulate during the progressive co-editing process between humans and AI.

View →
cs.CRcs.LGRecentJun 2, 2026

RogueMerge: Robust and Unified Attacks against LLM Model Merging

Jinghuai Zhang, Yetian He, Kunlin Cai, Han Zhao +2 more

RogueMerge introduces a unified framework to robustly attack LLM model merging by addressing the challenges of autoregressive decoding, unknown merging configurations, and prompt generalization, signi…

View →
cs.CLcs.AIcs.LGRecentMay 28, 2026

LLMSurgeon: Diagnosing Data Mixture of Large Language Models

Yaxin Luo, Jiacheng Cui, Xiaohan Zhao, Xinyi Shang +4 more

The paper introduces LLMSurgeon, a framework that estimates the domain-level data mixture of a Large Language Model (LLM) using only generated text, thereby providing a post-hoc method to audit the mo…

View →
cs.CLRecentMay 28, 2026

CanLegalRAGBench: Evaluating Retrieval-Augmented Generation on Canadian Case Law

Ethan Zhao, Maksym Taranukhin, Wei Cui, Moira Aikenhead +1 more

The paper introduces CanLegalRAGBench, a new Canadian legal QA benchmark, and evaluates RAG systems, finding that while open-source models are competitive, automatic evaluations struggle with nuanced…

View →
cs.AIRecentMay 27, 2026

PRO-CUA: Process-Reward Optimization for Computer Use Agents

Yifei He, Rui Yang, Hao Bai, Tong Zhang +1 more

PRO-CUA introduces a process-reward optimization framework that enables efficient, step-level reinforcement learning for training computer use agents by decoupling environment interaction from policy…

View →
cs.CLcs.CRRecentMay 1, 2026

ML-Bench&Guard: Policy-Grounded Multilingual Safety Benchmark and Guardrail for Large Language Models

Yunhan Zhao, Zhaorun Chen, Xingjun Ma, Yu-Gang Jiang +1 more

The paper introduces ML-Bench, a policy-grounded multilingual safety benchmark, and ML-Guard, a superior guardrail model that enables culturally and legally aligned safety assessment for LLMs across 1…

View →
cs.CRcs.AIcs.CVRecentMar 28, 2026

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses

Xiao Li, Xiang Zheng, Yifeng Gao, Xinyu Xia +34 more

This survey provides a comprehensive, structured review of safety research in Embodied AI, analyzing attacks and defenses across the entire embodied pipeline to guide the development of safe, robust,…

View →
cs.NIcs.CRRecentMar 24, 2026

Digital Twin Enabled Simultaneous Learning and Modeling for UAV-assisted Secure Communications with Eavesdropping Attacks

Jieting Yuan, Songhan Zhao, Ye Xue, Yu Zhao +2 more

The paper proposes a Digital Twin-enabled Simultaneous Learning and Modeling (DT-SLAM) framework to enhance secure communications in UAV-assisted networks against intelligent eavesdropping attacks, ac…

View →
cs.CRcs.AIRecentMar 18, 2026

WebPII: Benchmarking Visual PII Detection for Computer-Use Agents

Nathan Zhao

The paper introduces WebPII, a novel, large-scale synthetic benchmark for detecting personally identifiable information (PII) in web screenshots, and demonstrates a model (WebRedact) that significantl…

View →