Built by Teycir Ben Soltane
ArXivCSExplorer

Yu Zhang

45 indexed papers

Recent (6 mo): 45
With code: 0
Influential cites: 0
Benchmarked: 0

Publications per year

[Bar chart not reproduced; extracted labels: 45, 26]

Top categories

AI ×33, Crypto ×25, NLP ×17, ML ×13, Vision ×3, Multimedia ×3, Multiagent ×3, Software Eng. ×3

Frequent co-authors

Leo Yu Zhang (9×)
Yi Liu (7×)
Gelei Deng (7×)
Yuekang Li (7×)
Ying Zhang (7×)
Ningyu Zhang (5×)

Research Timeline

2026
Rethinking Memory as Continuously Evolving Connectivity

The paper proposes FluxMem, a novel connectivity-evolving memory framework that models memory as a dynamic graph to improve LLM agent performance in complex, changing environments.

MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems

The paper introduces MemTrace, a framework that treats LLM memory pipelines as traceable graphs to systematically diagnose and automatically correct memory-related errors, boosting performance by up to 7.62%.

VeriTrip: A Verifiable Benchmark for Travel Planning Agents over Unstructured Web Corpora

The paper introduces VeriTrip, a new verifiable benchmark that evaluates travel planning agents' ability to perform evidence-grounded reasoning over complex, unstructured, and multimodal web data, revealing a critical retrieval-reasoning trade-off.

Global Policy-Space Response Oracles for Two-Player Zero-Sum Games

The paper introduces Global PSRO, a novel deep reinforcement learning framework that efficiently approximates Nash equilibria in large two-player zero-sum games by intelligently expanding the strategy set using a metric called Population Exploitability.

Learning When to Optimize: Verified Optimization Skills from Expert GPU-Kernel Lineages

KLineage introduces a novel method to teach LLMs when and how to apply GPU kernel optimizations by reverse-engineering expert kernel lineages, resulting in superior optimization skills compared to existing baselines.

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

The paper introduces SNARE, a novel adaptive testing pipeline that systematically measures overeager behavior in coding agents, finding that the agent framework accounts for the majority of the variation in security risk.

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by injecting malicious text into user-generated content regions of mobile screenshots, successfully demonstrating the vulnerability of current GUI agents.

LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis

The paper introduces LongDS, a new benchmark for long-horizon, multi-turn data analysis, demonstrating that current AI agents struggle significantly with maintaining and updating complex analytical states over extended interactions.

SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations

The paper introduces SchGen, the first large language model capable of generating editable PCB schematics from natural language by using a novel semantically grounded code representation.

How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

The paper quantifies the exact parametric memory capacity of LLMs using LoRA and proposes a new optimization strategy, MemFT, to enhance memory fidelity.

NaRA: Noise-Aware LoRA for Parameter-Efficient Fine-Tuning of Diffusion LLMs

The paper introduces NaRA, a noise-aware LoRA technique that dynamically adapts fine-tuning parameters based on the noise level during diffusion, significantly improving the performance of Diffusion LLMs.

Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism

The paper analyzes observation masking in long-horizon search agents, finding that its effectiveness depends on a complex interaction between the model's capacity and the retriever's strength, exhibiting an inverted-U shaped gain.

Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration

The paper proposes SCALE, a self-improving web agent framework that uses adversarial roles and graph exploration to autonomously discover agent limitations and enhance adaptability in complex web environments.

Beyond Agreement: Scoring Panel-Surfaced Biomedical Entity Candidates for Curator Triage

The paper introduces BioConCal, a supervised scoring mechanism that evaluates biomedical NER candidates surfaced by multiple LLMs, significantly improving the quality of the candidate pool for human curators.

Consolidating Rewarded Perturbations for LLM Post-Training

The paper introduces CoRP, a gradient-free operator that consolidates the benefits of ensemble-based post-training methods into a single, deployable model update, significantly improving performance with minimal computational overhead.

ConsisGuard: Aligning Safety Deliberation with Policy Enforcement in LLM Guardrails

The paper introduces ConsisGuard, a framework that addresses the 'deliberation-to-enforcement gap' in LLM guardrails by ensuring that the reasoning process is faithfully and consistently translated into the final safety decision.

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step reasoning.

Temporally-Aligned Evaluation for Audio-Driven Talking Head Generation

The paper proposes a sequence-alignment framework using Soft Dynamic Time Warping to evaluate audio-driven talking-head generation, demonstrating that this approach provides more robust and fair comparisons than traditional frame-wise metrics.

Structure-Guided Adaptive Propagation for Protein-Protein Interaction Site Prediction

The paper introduces SGAP-PPIS, a structure-guided adaptive propagation model that improves protein-protein interaction site prediction by allowing information diffusion to adapt based on a residue's local geometric environment.

Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models

This paper introduces Imaginative Perception Tokens (IPT) to improve spatial reasoning in vision language models.


Papers

cs.AI • Recent • Jun 2, 2026

Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models

Mahtab Bigverdi, Lindsey Li, Weikai Huang, Yiming Liu +7 more

This paper introduces Imaginative Perception Tokens (IPT) to improve spatial reasoning in vision language models.

cs.AI • Recent • Jun 1, 2026

Structure-Guided Adaptive Propagation for Protein-Protein Interaction Site Prediction

Enqiang Zhu, Yizi Liu, Yilong Luo, Yao Chen +2 more

The paper introduces SGAP-PPIS, a structure-guided adaptive propagation model that improves protein-protein interaction site prediction by allowing information diffusion to adapt based on a residue's…

cs.AI • Recent • May 31, 2026

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts

Jiarui Feng, Hanqing Zeng, Karish Grover, Ruizhong Qiu +10 more

The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step r…

cs.GR, cs.AI, cs.CV • Recent • May 31, 2026

Temporally-Aligned Evaluation for Audio-Driven Talking Head Generation

Zhicheng Zhang, Lei Wang, Yu Zhang, Yongsheng Gao

The paper proposes a sequence-alignment framework using Soft Dynamic Time Warping to evaluate audio-driven talking-head generation, demonstrating that this approach provides more robust and fair compa…

cs.CL, cs.AI, cs.IR • Recent • May 29, 2026

Masking Stale Observations Helps Search Agents -- Until It Doesn't: A Regime Map and Its Mechanism

Haoxiang Zhang, Qixin Xu, Zhuofeng Li, Lei Zhang +3 more

The paper analyzes observation masking in long-horizon search agents, finding that its effectiveness depends on a complex interaction between the model's capacity and the retriever's strength, exhibit…

cs.AI • Recent • May 29, 2026

Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration

Weile Chen, Bingchen Miao, Qifan Yu, Wendong Bu +5 more

The paper proposes SCALE, a self-improving web agent framework that uses adversarial roles and graph exploration to autonomously discover agent limitations and enhance adaptability in complex web envi…

cs.CL, cs.AI • Recent • May 29, 2026

Beyond Agreement: Scoring Panel-Surfaced Biomedical Entity Candidates for Curator Triage

Shuheng Cao, Ruiqi Chen, Renjie Cao, Zhenhao Zhang +2 more

The paper introduces BioConCal, a supervised scoring mechanism that evaluates biomedical NER candidates surfaced by multiple LLMs, significantly improving the quality of the candidate pool for human c…

cs.CL, cs.LG • Recent • May 29, 2026

Consolidating Rewarded Perturbations for LLM Post-Training

Zheyu Zhang, Shuo Yang, Gjergji Kasneci

The paper introduces CoRP, a gradient-free operator that consolidates the benefits of ensemble-based post-training methods into a single, deployable model update, significantly improving performance w…

cs.CL • Recent • May 29, 2026

ConsisGuard: Aligning Safety Deliberation with Policy Enforcement in LLM Guardrails

Yan Wang, Zhixuan Chu, Zihao Xue, Zhen Bi +8 more

The paper introduces ConsisGuard, a framework that addresses the 'deliberation-to-enforcement gap' in LLM guardrails by ensuring that the reasoning process is faithfully and consistently translated in…

cs.LG, cs.AI, cs.CL • Recent • May 28, 2026

LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis

Kewei Xu, Xiaoben Lu, Shuofei Qiao, Zihan Ding +3 more

The paper introduces LongDS, a new benchmark for long-horizon, multi-turn data analysis, demonstrating that current AI agents struggle significantly with maintaining and updating complex analytical st…

cs.AI, cs.CL, cs.LG • Recent • May 28, 2026

SchGen: PCB Schematic Generation with Semantic-Grounded Code Representations

Qinpei Luo, Ruichun Ma, Xinyu Zhang, Lili Qiu

The paper introduces SchGen, the first large language model capable of generating editable PCB schematics from natural language by using a novel semantically grounded code representation.

cs.CL, cs.AI, cs.CV • Recent • May 28, 2026

How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

Ziwen Xu, Haiwen Hong, Linsong Yu, Benglei Cui +3 more

The paper quantifies the exact parametric memory capacity of LLMs using LoRA and proposes a new optimization strategy, MemFT, to enhance memory fidelity.

cs.AI • Recent • May 28, 2026

NaRA: Noise-Aware LoRA for Parameter-Efficient Fine-Tuning of Diffusion LLMs

Shuaidi Wang, Zhan Zhuang, Ruping Huang, Yu Zhang

The paper introduces NaRA, a noise-aware LoRA technique that dynamically adapts fine-tuning parameters based on the noise level during diffusion, significantly improving the performance of Diffusion L…

cs.CL, cs.AI, cs.LG • Recent • May 27, 2026

Rethinking Memory as Continuously Evolving Connectivity

Jizhan Fang, Buqiang Xu, Zhixian Wang, Haoliang Cao +11 more

The paper proposes FluxMem, a novel connectivity-evolving memory framework that models memory as a dynamic graph to improve LLM agent performance in complex, changing environments.

cs.CL, cs.AI, cs.LG • Recent • May 27, 2026

MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems

Xinle Deng, Ruobin Zhong, Hujin Peng, Xiaoben Lu +14 more

The paper introduces MemTrace, a framework that treats LLM memory pipelines as traceable graphs to systematically diagnose and automatically correct memory-related errors, boosting performance by up t…

cs.AI • Recent • May 27, 2026

VeriTrip: A Verifiable Benchmark for Travel Planning Agents over Unstructured Web Corpora

Yuting Xu, Jiayi Tian, Jian Liang, Xin Xiong +3 more

The paper introduces VeriTrip, a new verifiable benchmark that evaluates travel planning agents' ability to perform evidence-grounded reasoning over complex, unstructured, and multimodal web data, rev…

cs.AI • Recent • May 27, 2026

Global Policy-Space Response Oracles for Two-Player Zero-Sum Games

Junyu Zhang, Feihong Yang, Jian Wang, Chao Wang +1 more

The paper introduces Global PSRO, a novel deep reinforcement learning framework that efficiently approximates Nash equilibria in large two-player zero-sum games by intelligently expanding the strategy…

cs.AI • Recent • May 27, 2026

Learning When to Optimize: Verified Optimization Skills from Expert GPU-Kernel Lineages

Shuoming Zhang, Qiuchu Yu, Yangyu Zhang, Ruiyuan Xu +5 more

KLineage introduces a novel method to teach LLMs when and how to apply GPU kernel optimizations by reverse-engineering expert kernel lineages, resulting in superior optimization skills compared to exi…

cs.CR, cs.AI, cs.CL • Recent • May 27, 2026

SNARE: Adaptive Scenario Synthesis for Eliciting Overeager Behavior in Coding Agents

Yubin Qu, Yi Liu, Gelei Deng, Yanjun Zhang +3 more

The paper introduces SNARE, a novel adaptive testing pipeline that systematically measures overeager behavior in coding agents, finding that the agent framework accounts for the majority of the variat…

cs.CR, cs.AI, cs.CL • Recent • May 27, 2026

MIRAGE: Context-Aware Prompt Injection against Mobile GUI Agents via User-Generated Content

Ruoqi Guo, Yi Liu, Gelei Deng, Yiheng Xiong +6 more

The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by injecting malicious text into user-generated content regions of mobile screenshots, successfully…
