Yue Zhang
13 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
Agent Audit is a novel security analysis system that comprehensively audits LLM agent applications by examining the entire software stack—including tool code, configuration, and prompts—to detect a wide range of vulnerabilities.
LightGuard introduces a dual-link architecture that uses a physically confined LiFi channel to securely bootstrap cryptographic session keys, thereby mitigating the risk of key exposure inherent in traditional open-air WiFi communication.
The paper proposes SAGE, a framework that uses Signal-Amplified Guided Embeddings to overcome 'Signal Submersion' in LLMs, significantly boosting vulnerability detection accuracy across multiple programming languages.
The paper proposes AgentDID, a decentralized framework using DIDs and verifiable credentials to provide trustless identity authentication and dynamic state verification for autonomous, self-managed AI agents.
The paper proposes SafeReview, a co-evolutionary adversarial training framework that significantly improves the robustness of LLM-based peer review systems against sophisticated adversarial hidden prompts.
SafeHarbor is a novel, hierarchical memory-augmented framework that establishes context-aware decision boundaries for LLM agents, achieving state-of-the-art safety while minimizing over-refusal.
MemPrivacy introduces a novel framework that protects sensitive user data in edge-cloud memory systems by replacing private spans with semantically structured placeholders, thereby minimizing data exposure without sacrificing memory utility.
This paper introduces UPAttack, a novel threat model demonstrating that focusing on explicit usability requirements can cause LLMs to generate insecure code by neglecting implicit security constraints, and proposes U-SPLOIT to automate this attack.
The paper proposes DMN, a compositional jailbreak framework that utilizes distributed instructions, multimodal evidence, and a number chain task across multiple images to significantly enhance the attack success rate against multimodal LLMs.
This paper introduces a new evaluation framework, SpatialUncertain, demonstrating that current Vision-Language Models (VLMs) are prone to overconfident and incorrect answers to spatial questions when visual evidence is incomplete or misleading.
The paper proposes Preference Delta Aggregation (PDA), a framework that aggregates multiple weak preference signals derived from smaller model pairs using LoRA merging to significantly boost the performance of a strong large language model.
FedMTFI introduces a novel federated learning framework that uses multi-teacher knowledge distillation and feature importance to improve model performance and robustness in heterogeneous and non-IID data environments.
The paper introduces the concept of Search-Time Contamination (STC), demonstrating that deep research agents can leak information from public benchmarks via web search, leading to an overestimation of their true reasoning ability.
Papers
Search-Time Contamination in Deep Research Agents: Measuring Performance Inflation in Public Benchmark Evaluation
Yongjie Wang, Xinyue Zhang, Kunhong Yao, Zhiwei Zeng +3 more
The paper introduces the concept of Search-Time Contamination (STC), demonstrating that deep research agents can leak information from public benchmarks via web search, leading to an overestimation of…