Ao Wang

50 indexed papers

Top categories

AI ×32, Crypto ×20, ML ×10, NLP ×9, Vision ×4, Info Retrieval ×2, Algorithms ×1, Society ×1

Frequent co-authors

Hao Wang (8×)

Zihao Wang (4×)

Wenhao Wang (3×)

Yu Chen (3×)

Yuhao Wang (2×)

Zhao Liu (2×)

Research Timeline

2026

BAGEN: Are LLM Agents Budget-Aware?

This paper introduces the concept of Budget-Aware Agents (BAGEN), shows that current LLM agents often fail to manage resources proactively, and proposes incorporating early stopping and interval estimation to significantly improve efficiency.

Appropriateness of Empathy in AI: A Signal-Cost Perspective

This paper proposes a Signal Cost Proxy framework, drawing from signaling theory, to systematically evaluate the contextual appropriateness of empathy in AI interactions.

AMix-2: Establishing Protein as a Native Modality in Large Language Models

The paper introduces AMix-2, a novel protein-text foundation model that unifies protein understanding and sequence design by embedding both modalities in a shared token space, achieving state-of-the-art performance on comprehensive benchmarks.

A physics-informed foundation model for quantitative diffusion MRI

The paper introduces PIGMENT, a physics-informed foundation model that enables reliable quantitative mapping of brain microstructure from extremely sparse or challenging diffusion MRI scans.

A Unified and Reproducible Experimentation Framework for Speech Understanding

The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.

PatchWorld: Gradient-Free Optimization of Executable World Models

PatchWorld introduces a gradient-free framework to create executable Python world models from offline trajectories, achieving high planning scores by inducing symbolic belief-state programs.

I-WebGenBench: Evaluating Interactivity in LLM-Generated Scientific Web Applications

The paper introduces I-WebGenBench, a framework and benchmark that converts static scientific papers into executable, interactive web systems, allowing users to dynamically explore the paper's mechanisms.

Confused ChatGPT: Cross-App Context Poisoning via First-Party APIs

The paper identifies and demonstrates a novel vulnerability, cross-app context poisoning, in the shared context architecture of ChatGPT Apps, allowing malicious apps to manipulate the LLM's behavior across different, benign co-resident apps.

Large Language Models in Transportation Systems Management and Operations: From Text Reasoning to Multi-modal Decision Support

This survey reviews how large language models (LLMs) and multi-modal LLMs (MM-LLMs) are being applied to integrate diverse data sources for enhanced decision support in transportation systems management and operations.

SafeSteer: Localized On-Policy Distillation for Efficient Safety Alignment

SafeSteer proposes a localized on-policy distillation method that restricts safety alignment to specific safety tokens, thereby achieving strong safety performance with minimal degradation to general capabilities and significantly reducing data requirements.

MCP-Persona: Benchmarking LLM Agents on Real-World Personal Applications via Environment Simulation

The paper introduces MCP-Persona, a novel benchmark designed to evaluate LLM agents' performance on real-world, personalized applications using the Model Context Protocol (MCP), revealing that current state-of-the-art agents struggle with such personalized tool use.

Geometry-Aware Implicit Memory for Video World Models

The paper proposes GIM-World, a geometry-aware implicit memory framework that significantly improves long-horizon video world models by explicitly encoding 3D scene geometry into a compact memory state.

Do Multimodal Agents Really Benefit from Tool Use? A Systematic Study of Capability Gains

The paper argues that observed gains in multimodal agents using tools may be due to learning tool-calling patterns rather than genuine capability expansion, finding that tool access provides little consistent aggregate improvement.

SafeMCP: Proactive Power Regulation for LLM Agent Defense via Environment-Grounded Look-Ahead Reasoning

SafeMCP is a server-side defense plugin that uses look-ahead reasoning to proactively filter and constrain tool acquisition for LLM agents, thereby mitigating catastrophic risks associated with expanding action spaces.

Shortcut to Nowhere: Demystifying Deep Spurious Regression

The paper introduces Deep Spurious Regression (DSR) to address spurious correlations in continuous prediction tasks, proposing a method that exploits attribute similarity in both feature and label spaces for robust generalization.

GJDNet: Robust Graph Neural Networks via Joint Disentangled Learning Against Adversarial Attacks

GJDNet proposes a joint disentanglement framework to enhance the robustness of Graph Neural Networks against adversarial attacks by simultaneously stabilizing node representations and decision boundaries across diverse graph connectivity types.

ResMerge: Residual-based Spectral Merging of Large Language Models

ResMerge proposes a residual-based spectral merging framework that improves the combination of multiple reinforcement learning (RL) expert models by stabilizing the aggregation process using a residual backbone.

PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training

This paper proposes a preconditioning layer for stable weight conditioning in LLM training.

OneReason Technical Report

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.

Multi-Objective Submodular Maximization with Differential Privacy

This paper addresses the challenging problem of multi-objective submodular maximization under a cardinality constraint while ensuring differential privacy, proposing novel algorithms with approximation guarantees.

Papers

cs.LG, cs.AI, Empirical, Recent, Jun 4, 2026

PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training

Senmiao Wang, Tiantian Fang, Haoran Zhang, Yushun Zhang +3 more

This paper proposes a preconditioning layer for stable weight conditioning in LLM training.

cs.IR, cs.AI, cs.CL, Recent, Jun 4, 2026