Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:
ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Home/Authors/Ao Zhang

Ao Zhang

50 indexed papers

Recent (6 mo)
50
With code
0
Influential cites
0
Benchmarked
0

Publications per year

50
26

Top categories

AI×30Crypto×15NLP×14Vision×5Robotics×4ML×4Emerging Tech×3Info Retrieval×2

Frequent co-authors

Wentao Zhang5×
Luyao Zhang3×
Qingzhao Zhang3×
Yan Wang2×
Chao Zhang2×
Yuhao Zhang2×

Research Timeline

2026
From Noise to Control: Parameterized Diffusion Policies

The Parameterized Diffusion Policy (PDP) framework transforms diffusion models from general stochastic generators into precise, steerable tools for learning and adapting complex robotic behaviors by embedding them on a semantically structured latent manifold.

BAGEN: Are LLM Agents Budget-Aware?

This paper introduces the concept of Budget-Aware Agents (BAGEN), showing that current LLM agents often fail to manage resources proactively, and proposes that incorporating early stop and interval estimation significantly improves efficiency.

Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration

The paper proposes SCALE, a self-improving web agent framework that uses adversarial roles and graph exploration to autonomously discover agent limitations and enhance adaptability in complex web environments.

DeMaVLA: A Vision-Language-Action Foundation Model for Generalizable Deformable Manipulation

DeMaVLA is a generalizable Vision-Language-Action foundation model designed for deformable object manipulation, achieving strong real-world performance on folding tasks by leveraging large-scale real-world data and corrective learning.

Beyond Agreement: Scoring Panel-Surfaced Biomedical Entity Candidates for Curator Triage

The paper introduces BioConCal, a supervised scoring mechanism that evaluates biomedical NER candidates surfaced by multiple LLMs, significantly improving the quality of the candidate pool for human curators.

GaMi: Geometry-Agnostic Material Identification via Cross-Modal Subtractive Disentanglement

GaMi is a multimodal material identification system that uses mmWave and acoustic sensing with a cross-modal subtractive disentanglement framework to achieve high accuracy (95.2%) for material identification regardless of geometric variations.

Learning Agent-Compatible Context Management for Long-Horizon Tasks

The paper introduces Adaptive Context Management (AdaCoM), an external context manager that uses reinforcement learning to improve the performance of frozen LLM agents on long-horizon tasks by intelligently managing and pruning accumulated context.

Seeing Before Agreeing: Aligning Multi-Agent Consensus with Visual Evidence

The paper proposes EAGLE, a novel evidence-aligned multi-agent framework, demonstrating that requiring shared visual evidence among agents is crucial for achieving reliable and trustworthy consensus in multimodal Visual Question Answering (VQA).

UniAudio-Token: Empowering Semantic Speech Tokenizers with General Audio Perception

UniAudio-Token is a framework that enhances existing semantic speech tokenizers with general audio perception, allowing them to handle diverse audio types while maintaining high-fidelity speech capabilities.

Richer Representations for Neural Algorithmic Reasoning via Auxiliary Reconstruction

The paper proposes using an auxiliary reconstruction task, specifically one that captures intra-state feature dependencies, to improve the quality of state representations learned by the encoder in neural algorithmic reasoning.

ProactiveLLM: Learning Active Interaction for Streaming Large Language Models

ProactiveLLM introduces a novel framework that enables streaming LLMs to actively decide when to interact with incoming data by leveraging the model's internal states, significantly reducing latency while maintaining quality.

ANDES: Agent Native Data Evolving Synthesis Tool for Autonomous Instruction Alignment

The paper introduces Andes, a framework that treats data generation as a plug-and-play agent skill, enabling autonomous alignment of LLMs by providing an intelligent, closed-loop data synthesis interface.

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

SkillRevise is an execution-grounded framework that iteratively refines initial, imperfect LLM agent skills by diagnosing defects from execution evidence and applying empirically validated edits, significantly boosting agent performance.

MiCU: End-to-End Smart Home Command Understanding with Large Language Model

The paper introduces MiCU, a domain-specific LLM that significantly improves smart home command understanding, especially for ambiguous commands, by synthesizing training data and optimizing the model for efficiency.

Implicit Drifting Policy: One-Step Action Generation via Conditional Expert Geometry

The Implicit Drifting Policy (IDP) is a novel one-step action generation framework that implicitly enforces trajectory correction constraints by analyzing local expert action geometry, overcoming the difficulties of explicitly estimating a training-time drifting field.

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

The paper introduces SkillHarm, a comprehensive benchmark and automated framework for evaluating skill-based attacks across the entire agent skill-use lifecycle, demonstrating that current agents remain highly vulnerable to both fixed-payload and self-mutating poisoning attacks.

Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

The paper introduces Humanoid-GPT, a large-scale generative Transformer model that achieves robust zero-shot motion tracking and control by training on a massive, unified corpus of motion data.

QUBRIC: Co-Designing Queries and Rubrics for RL Beyond Verifiable Rewards

QUBRIC introduces a co-design framework that simultaneously optimizes queries and rubrics, overcoming the bottleneck of vague rubrics derived from open-ended questions, leading to significant gains in RL performance.

OneReason Technical Report

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.

CORE-Bench: A Comprehensive Benchmark for Code Retrieval in the Era of Agentic Coding

This paper introduces CORE-Bench, a comprehensive benchmark for code retrieval in agentic coding.

Highlighted terms show continued research focus across papers

Papers

cs.IREmpiricalRecentJun 10, 2026

CORE-Bench: A Comprehensive Benchmark for Code Retrieval in the Era of Agentic Coding

Fuwei Zhang, Yanzhao Zhang, Mingxin Li, Dingkun Long +4 more

This paper introduces CORE-Bench, a comprehensive benchmark for code retrieval in agentic coding.

View →
cs.IRcs.AIcs.CLRecentJun 4, 2026

OneReason Technical Report

OneRec Team, Biao Yang, Boyang Ding, Chenglong Chu +80 more

The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coheren…

View →
cs.ROcs.AIcs.CVRecentJun 2, 2026

Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

Zekun Qi, Xuchuan Chen, Dairu Liu, Chenghuai Lin +9 more

The paper introduces Humanoid-GPT, a large-scale generative Transformer model that achieves robust zero-shot motion tracking and control by training on a massive, unified corpus of motion data.

View →
cs.CLcs.AIRecentJun 2, 2026

QUBRIC: Co-Designing Queries and Rubrics for RL Beyond Verifiable Rewards

Rongzhi Zhang, Rui Feng, Zhihan Zhang, Jingfeng Yang +7 more

QUBRIC introduces a co-design framework that simultaneously optimizes queries and rubrics, overcoming the bottleneck of vague rubrics derived from open-ended questions, leading to significant gains in…

View →
cs.CLRecentJun 1, 2026

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

Yuting Ning, Zhehao Zhang, Yash Kumar Lal, Boyu Gou +7 more

The paper introduces SkillHarm, a comprehensive benchmark and automated framework for evaluating skill-based attacks across the entire agent skill-use lifecycle, demonstrating that current agents rema…

View →
cs.AIRecentMay 31, 2026

ANDES: Agent Native Data Evolving Synthesis Tool for Autonomous Instruction Alignment

Zhengyang Zhao, Shengjie Ye, Lu Ma, Hao Liang +2 more

The paper introduces Andes, a framework that treats data generation as a plug-and-play agent skill, enabling autonomous alignment of LLMs by providing an intelligent, closed-loop data synthesis interf…

View →
cs.AIRecentMay 31, 2026

SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision

Yuxuan Liu, Zhaochen Su, Lingyun Xie, Yuhao Zhang +10 more

SkillRevise is an execution-grounded framework that iteratively refines initial, imperfect LLM agent skills by diagnosing defects from execution evidence and applying empirically validated edits, sign…

View →
cs.CLcs.AIRecentMay 31, 2026

MiCU: End-to-End Smart Home Command Understanding with Large Language Model

Haowei Han, Kexin Hu, Weiwei Cai, Debiao Zhang +5 more

The paper introduces MiCU, a domain-specific LLM that significantly improves smart home command understanding, especially for ambiguous commands, by synthesizing training data and optimizing the model…

View →
cs.ROcs.AIRecentMay 31, 2026

Implicit Drifting Policy: One-Step Action Generation via Conditional Expert Geometry

Zemin Yang, Yaoyu He, Yiming Zhong, Yuhao Zhang +4 more

The Implicit Drifting Policy (IDP) is a novel one-step action generation framework that implicitly enforces trajectory correction constraints by analyzing local expert action geometry, overcoming the…

View →
cs.LGcs.AIRecentMay 30, 2026

Richer Representations for Neural Algorithmic Reasoning via Auxiliary Reconstruction

Jiafu Huang, Chao Peng, Chenyang Xu, Zhengfeng Yang +6 more

The paper proposes using an auxiliary reconstruction task, specifically one that captures intra-state feature dependencies, to improve the quality of state representations learned by the encoder in ne…

View →
cs.CLRecentMay 30, 2026

ProactiveLLM: Learning Active Interaction for Streaming Large Language Models

Junlong Tong, Yao Zhang, Anhao Zhao, Yingqi Fan +2 more

ProactiveLLM introduces a novel framework that enables streaming LLMs to actively decide when to interact with incoming data by leveraging the model's internal states, significantly reducing latency w…

View →
cs.AIcs.LGRecentMay 29, 2026

From Noise to Control: Parameterized Diffusion Policies

Renhao Zhang, Haotian Fu, Mingxi Jia, George Konidaris +2 more

The Parameterized Diffusion Policy (PDP) framework transforms diffusion models from general stochastic generators into precise, steerable tools for learning and adapting complex robotic behaviors by e…

View →
cs.LGcs.AIcs.CLRecentMay 29, 2026

BAGEN: Are LLM Agents Budget-Aware?

Yuxiang Lin, Zihan Wang, Mengyang Liu, Yuxuan Shan +8 more

This paper introduces the concept of Budget-Aware Agents (BAGEN), showing that current LLM agents often fail to manage resources proactively, and proposes that incorporating early stop and interval es…

View →
cs.AIRecentMay 29, 2026

Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration

Weile Chen, Bingchen Miao, Qifan Yu, Wendong Bu +5 more

The paper proposes SCALE, a self-improving web agent framework that uses adversarial roles and graph exploration to autonomously discover agent limitations and enhance adaptability in complex web envi…

View →
cs.ROcs.AIRecentMay 29, 2026

DeMaVLA: A Vision-Language-Action Foundation Model for Generalizable Deformable Manipulation

Taiyi Su, Jian Zhu, Tianjian Wang, Youzhang He +8 more

DeMaVLA is a generalizable Vision-Language-Action foundation model designed for deformable object manipulation, achieving strong real-world performance on folding tasks by leveraging large-scale real-…

View →
cs.CLcs.AIRecentMay 29, 2026

Beyond Agreement: Scoring Panel-Surfaced Biomedical Entity Candidates for Curator Triage

Shuheng Cao, Ruiqi Chen, Renjie Cao, Zhenhao Zhang +2 more

The paper introduces BioConCal, a supervised scoring mechanism that evaluates biomedical NER candidates surfaced by multiple LLMs, significantly improving the quality of the candidate pool for human c…

View →
cs.ETcs.AIcs.SDRecentMay 29, 2026

GaMi: Geometry-Agnostic Material Identification via Cross-Modal Subtractive Disentanglement

Zhiwei Chen, Yijie Li, Yimo Zhang, Shiyun Shao +8 more

GaMi is a multimodal material identification system that uses mmWave and acoustic sensing with a cross-modal subtractive disentanglement framework to achieve high accuracy (95.2%) for material identif…

View →
cs.AIRecentMay 29, 2026

Learning Agent-Compatible Context Management for Long-Horizon Tasks

Lu Yi, Runlin Lei, Liuyi Yao, Yuexiang Xie +5 more

The paper introduces Adaptive Context Management (AdaCoM), an external context manager that uses reinforcement learning to improve the performance of frozen LLM agents on long-horizon tasks by intelli…

View →
cs.CVcs.AIcs.MARecentMay 29, 2026

Seeing Before Agreeing: Aligning Multi-Agent Consensus with Visual Evidence

Yuhan Wang, Shuochen Chang, Yalin Feng, Dongsheng Ma +7 more

The paper proposes EAGLE, a novel evidence-aligned multi-agent framework, demonstrating that requiring shared visual evidence among agents is crucial for achieving reliable and trustworthy consensus i…

View →
cs.CLcs.SDRecentMay 29, 2026

UniAudio-Token: Empowering Semantic Speech Tokenizers with General Audio Perception

Yuhan Song, Linhao Zhang, Aiwei Liu, Chuhan Wu +5 more

UniAudio-Token is a framework that enhances existing semantic speech tokenizers with general audio perception, allowing them to handle diverse audio types while maintaining high-fidelity speech capabi…

View →