Hao Wang
46 indexed papers
Research Timeline
The paper introduces $\text{RLR}^3$, a novel framework that extends verifiable rewards in Reinforcement Learning to handle partially verifiable, multi-criteria vision-language tasks by integrating robust rubric scoring.
iLoRA introduces a novel Bayesian graph-conditioned LoRA framework that jointly learns prediction and latent interaction structure, significantly improving microbiome diagnosis by modeling microbe-microbe cross-talk.
The paper proposes HTP, a novel framework that leverages Large Language Models (LLMs) to first generate abstract travel patterns and then synthesize realistic GPS points, significantly improving trajectory generation quality over existing methods.
The paper introduces VikingMem, a novel Memory Base Management System that effectively manages the persistent state of long-term LLM interactions by selectively extracting, evolving, and compressing memories, significantly outperforming existing methods.
DP-SAPF introduces a saliency-aware parameter fine-tuning method that selectively identifies the most critical parameters for LoRA training, significantly improving the utility and fidelity of differentially private image synthesis while reducing computational cost.
The paper introduces AMix-2, a novel protein-text foundation model that unifies protein understanding and sequence design by embedding both modalities in a shared token space, achieving state-of-the-art performance on comprehensive benchmarks.
The paper introduces PIGMENT, a physics-informed foundation model that enables reliable quantitative mapping of brain microstructure from extremely sparse or challenging diffusion MRI scans.
The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.
PatchWorld introduces a gradient-free framework to create executable Python world models from offline trajectories, achieving high planning scores by inducing symbolic belief-state programs.
dMoE proposes a block-level Mixture-of-Experts (MoE) framework for Diffusion Large Language Models (dLLMs) that aggregates token-level expert distributions into a unified block-level distribution, significantly reducing memory usage and improving inference speed.
The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechanisms.
The paper identifies and demonstrates a novel vulnerability, cross-app context poisoning, in the shared context architecture of ChatGPT Apps, allowing malicious apps to manipulate the LLM's behavior across different, benign co-resident apps.
This survey reviews how Large and Multi-modal Language Models (LLMs/MM-LLMs) are being applied to integrate diverse data sources for enhanced decision support in transportation systems management and operations.
SafeSteer proposes a localized on-policy distillation method that restricts safety alignment to specific safety tokens, thereby achieving strong safety performance with minimal degradation to general capabilities and significantly reducing data requirements.
The paper introduces MCP-Persona, a novel benchmark designed to evaluate LLM agents' performance on real-world, personalized applications using the Model Context Protocol (MCP), revealing that current state-of-the-art agents struggle with such personalized tool use.
The paper argues that observed gains in multimodal agents using tools may be due to learning tool-calling patterns rather than genuine capability expansion, finding that tool access provides little consistent aggregate improvement.
SafeMCP is a server-side defense plugin that uses look-ahead reasoning to proactively filter and constrain tool acquisition for LLM agents, thereby mitigating catastrophic risks associated with expanding action spaces.
The paper introduces Deep Spurious Regression (DSR) to address spurious correlations in continuous prediction tasks, proposing a method that exploits attribute similarity in both feature and label spaces for robust generalization.
The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.
This paper addresses the challenging problem of multi-objective submodular maximization under a cardinality constraint while ensuring differential privacy, proposing novel algorithms with approximation guarantees.
Papers
OneReason Technical Report
OneRec Team, Biao Yang, Boyang Ding, Chenglong Chu +80 more
The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coheren…