Hao Wang

50 indexed papers

Top categories

AI ×28, Crypto ×13, ML ×10, NLP ×7, Vision ×7, Algorithms ×2, Robotics ×2, Info Retrieval ×2

Frequent co-authors

Zihao Wang (6×)

Chao Wang (4×)

Wenhao Wang (4×)

Xinchao Wang (3×)

Yuhao Wang (3×)

Yi Yang (2×)

Research Timeline

2026

SafeSteer: Localized On-Policy Distillation for Efficient Safety Alignment

SafeSteer proposes a localized on-policy distillation method that restricts safety alignment to specific safety tokens, thereby achieving strong safety performance with minimal degradation to general capabilities and significantly reducing data requirements.

MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation

The paper introduces MCP-Persona, a novel benchmark designed to evaluate LLM agents' performance on real-world, personalized applications using the Model Context Protocol (MCP), revealing that current state-of-the-art agents struggle with such personalized tool use.

OneReason Technical Report

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.

Multi-Objective Submodular Maximization with Differential Privacy

This paper addresses the challenging problem of multi-objective submodular maximization under a cardinality constraint while ensuring differential privacy, proposing novel algorithms with approximation guarantees.

Process Advantage Signal Shaping: A Paradigm-Agnostic Middleware for Process-Supervised RL in LLM Reasoners

This paper introduces PASS (Process Advantage Signal Shaping), a method to address three pathologies in GRPO (Group Relative Policy Optimization) for process-supervised reinforcement learning of LLM reasoners.

What Memory Do GUI Agents Really Need? From Passive Records to Active Task-Driving States

This paper introduces Active Task Driving Memory (ATMem), an actively maintained execution state for mobile GUI agents, and STR-GRPO, an online reinforcement learning method that uses ATMem selectively.

ERA: Entropy-Guided Visual Token Pruning with Rectified Attention for Efficient MLLMs

This paper proposes ERA, a framework for efficient multimodal large language models using entropy-guided visual token pruning, rectified attention, and bias-aware token recycling.

Flex-Forcing: Towards a Unified Autoregressive and Bidirectional Video Diffusion Model

This paper introduces Flex-Forcing, a framework for video generation that enables a model to operate under both bidirectional and autoregressive generation regimes, achieving better video quality and faster inference than existing methods.

LBR: Towards Mitigating Length Bias in Large Language Models for Recommendation

This paper proposes LBR, a framework to mitigate length bias in large language model-based recommendation systems.

Infinite Worlds with Versatile Interactions

The paper introduces LingBot-World 2.0, an advanced version of a language model with unbounded interaction horizon, rapid response time, diverse interactive elements, and agentic harness integration.

Evolutionary Intelligence for Scientific Discovery: From Evolutionary Computation to Cumulative Discovery Systems

This paper proposes Evolutionary Intelligence (EI) for scientific discovery, which links candidate refinement with experience retention across evolutionary cycles.

BadWAM: When World-Action Models Dream Right but Act Wrong

This paper introduces BadWAM, a framework for modeling and evaluating World-Action Drift Attacks, a new class of adversarial attacks that break the alignment between a World-Action Model's (WAM's) imagined future and its executed actions.

CRAFT: Clustering Rubrics to Diagnose Weak LLM Capabilities and Generate Targeted Fine-Tuning Data

The paper introduces CRAFT, a method for converting evaluation datasets into model-specific diagnoses of weak capabilities, achieving stronger results than baselines on four open source models and two professional domains.

CodeRescue: Budget-Calibrated Recovery Routing for Coding Agents

This paper proposes a recovery routing system for coding agents that uses a supervised router and Conformal Risk Control (CRC) layer to determine when to spend more compute or escalate to a stronger model.

Robots Acquire Manipulation Skills in Seconds from a Single Human Video

A new framework called HOST enables robots to acquire new skills from a single human video in seconds while retaining previously mastered skills.

Cross-Tokenizer On-Policy Distillation via Byte-Prefix Marginalization

This paper introduces Byte-Prefix Marginalization (BPM), a method for consolidating open-weight language models through on-policy distillation while preserving teacher probability mass and maintaining a shared byte space.

Twins: Learn to Predict Unified Representations with Focal Loss

This paper proposes Twins, a unified continuous token space for multimodal models using ViT and VAE features, and addresses optimization imbalance with a focal regression objective.

SpecBox: Speculative Sandbox Scheduling for Efficient LLM Agent Serving

This paper presents SpecBox, a runtime system for LLM agents that uses speculative sandbox preallocation to improve resource utilization and reduce interactive tail latency.

Optimization of the directed spanning trees using the weighted matroid intersection algorithm

The paper presents an algorithm for updating a Directed Minimum Spanning Tree using the weighted matroid intersection algorithm and a dynamic auxiliary graph.

Algorithmic Separation between Constant-Depth and Logarithmic-Depth Neural Networks

This paper provides the first algorithmic separation between constant-depth and logarithmic-depth networks, identifying a class of Boolean functions that logarithmic-depth networks can learn efficiently and exhibiting a subclass for which constant-depth networks incur constant approximation error.

Papers

cs.DS · math.CO · NEW · Theoretical · Jul 28, 2026

Optimization of the directed spanning trees using the weighted matroid intersection algorithm

Binhong Jiang, Gehao Wang

The paper presents an algorithm for updating a Directed Minimum Spanning Tree using the weighted matroid intersection algorithm and a dynamic auxiliary graph.

cs.LG · stat.ML