Yu Zhang
45 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper proposes FluxMem, a novel connectivity-evolving memory framework that models memory as a dynamic graph to improve LLM agent performance in complex, changing environments.
The paper introduces MemTrace, a framework that treats LLM memory pipelines as traceable graphs to systematically diagnose and automatically correct memory-related errors, boosting performance by up to 7.62%.
The paper introduces VeriTrip, a new verifiable benchmark that evaluates travel planning agents' ability to perform evidence-grounded reasoning over complex, unstructured, and multimodal web data, revealing a critical retrieval-reasoning trade-off.
The paper introduces Global PSRO, a novel deep reinforcement learning framework that efficiently approximates Nash equilibria in large two-player zero-sum games by intelligently expanding the strategy set using a metric called Population Exploitability.
KLineage introduces a novel method to teach LLMs when and how to apply GPU kernel optimizations by reverse-engineering expert kernel lineages, resulting in superior optimization skills compared to existing baselines.
The paper introduces SNARE, a novel adaptive testing pipeline that systematically measures overeager behavior in coding agents, finding that the agent framework accounts for the majority of the variation in security risk.
The paper introduces MIRAGE, a novel pipeline that generates context-aware prompt injection attacks by injecting malicious text into user-generated content regions of mobile screenshots, successfully demonstrating the vulnerability of current GUI agents.
The paper introduces LongDS, a new benchmark for long-horizon, multi-turn data analysis, demonstrating that current AI agents struggle significantly with maintaining and updating complex analytical states over extended interactions.
The paper introduces SchGen, the first large language model capable of generating editable PCB schematics from natural language by using a novel semantically grounded code representation.
The paper quantifies the exact parametric memory capacity of LLMs using LoRA and proposes a new optimization strategy, MemFT, to enhance memory fidelity.
The paper introduces NaRA, a noise-aware LoRA technique that dynamically adapts fine-tuning parameters based on the noise level during diffusion, significantly improving the performance of Diffusion LLMs.
The paper analyzes observation masking in long-horizon search agents, finding that its effectiveness depends on a complex interaction between the model's capacity and the retriever's strength, exhibiting an inverted-U shaped gain.
The paper proposes SCALE, a self-improving web agent framework that uses adversarial roles and graph exploration to autonomously discover agent limitations and enhance adaptability in complex web environments.
The paper introduces BioConCal, a supervised scoring mechanism that evaluates biomedical NER candidates surfaced by multiple LLMs, significantly improving the quality of the candidate pool for human curators.
The paper introduces CoRP, a gradient-free operator that consolidates the benefits of ensemble-based post-training methods into a single, deployable model update, significantly improving performance with minimal computational overhead.
The paper introduces ConsisGuard, a framework that addresses the 'deliberation-to-enforcement gap' in LLM guardrails by ensuring that the reasoning process is faithfully and consistently translated into the final safety decision.
The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step reasoning.
The paper proposes a sequence-alignment framework using Soft Dynamic Time Warping to evaluate audio-driven talking-head generation, demonstrating that this approach provides more robust and fair comparisons than traditional frame-wise metrics.
The paper introduces SGAP-PPIS, a structure-guided adaptive propagation model that improves protein-protein interaction site prediction by allowing information diffusion to adapt based on a residue's local geometric environment.
This paper introduces Imaginative Perception Tokens (IPT) to improve spatial reasoning in vision language models.
Papers
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models
Mahtab Bigverdi, Lindsey Li, Weikai Huang, Yiming Liu +7 more
This paper introduces Imaginative Perception Tokens (IPT) to improve spatial reasoning in vision language models.