Jun Li
22 indexed papers
Research Timeline
This paper systematically analyzes the resilience of LLM-enhanced search engines against black-hat SEO attacks, finding that while they block most traditional attacks, they remain vulnerable to sophisticated LLM-generated query manipulations.
The paper introduces SET, a robust input-level backdoor detection framework that detects hidden malicious triggers in text-to-image diffusion models by analyzing systematic differences in how benign and backdoor inputs respond to controlled cross-attention scaling perturbations.
The paper introduces TICoE, a text-image collaborative framework that achieves precise and faithful concept removal from text-to-image generative models, surpassing existing methods in both precision and content fidelity.
PRAG is an end-to-end privacy-preserving Retrieval-Augmented Generation (RAG) system that maintains high retrieval accuracy and scalability in cloud environments by encrypting both documents and queries.
The paper introduces APIOT, the first LLM framework capable of autonomously performing the full discovery, exploitation, patching, and verification cycle against bare-metal industrial OT devices.
SecureForge is an automated pipeline that significantly reduces cybersecurity vulnerabilities in LLM-generated code by optimizing system prompts, achieving up to a 48% reduction in output vulnerabilities.
The paper proposes M³Att, a knowledge-poisoning framework that injects covert misinformation into medical multimodal RAG systems using paired visual data triggers, demonstrating attacks that generate clinically plausible but incorrect diagnoses.
This paper introduces the 'wide-net-casting' jailbreak scenario, demonstrating that querying a group of large language models can expose significant, previously overlooked safety risks, with a novel method achieving 100% jailbreak success in some tests.
The paper introduces POLARIS, a novel framework that systematically generates comprehensive and verifiable safety tests for LLMs by formalizing natural language policies into First-Order Logic and exploring the resulting Semantic Policy Graph.
The paper proposes HiSME, a lightweight hierarchical skill meta-evolving solution that jointly optimizes skills and the skill-evolving strategy by learning meta-skills from task execution traces, leading to improved agent performance.
The paper introduces a unified framework to fairly evaluate LLM agentic capabilities by standardizing diverse benchmarks and separating the effects of the LLM from those of the surrounding framework and environment.
SmartDirector is a novel framework that significantly improves cinematic video generation by using multiple keyframes to provide precise control over narrative structure and temporal pacing.
The paper introduces AgentSchool, an advanced LLM-powered multi-agent simulator that models learning as state transitions to provide a robust, ethically viable testbed for educational research and pedagogical reform.
This study analyzes over 20,000 real-world coding sessions to show that AI coding agents frequently fail users through subtle misalignment, requiring constant manual correction even when major system damage is avoided.
AutoSci is a memory-centric agentic system designed to automate the entire scientific research lifecycle by integrating structured memory, multi-stage execution, and continuous self-improvement.
The paper proposes GUIDE, a physics-guided deep unfolding framework that enables practical, real-time cross-band channel prediction for AI-RAN by embedding wireless channel physics, significantly improving beamforming gain while maintaining high inference speed.
SafeSteer proposes a localized on-policy distillation method that restricts safety alignment to specific safety tokens, thereby achieving strong safety performance with minimal degradation to general capabilities and significantly reducing data requirements.
The paper introduces ToolFG, a novel tool-integrated MLLM framework that enhances fine-grained image classification by enabling models to autonomously use external tools to gather verifiable visual cues.
The paper proposes a novel nonparametric mutual information estimator to robustly quantify dependence between heterogeneous temporal data, specifically continuous time series and discrete event sequences.
The paper introduces the concept of Search-Time Contamination (STC), demonstrating that deep research agents can leak information from public benchmarks via web search, leading to an overestimation of their true reasoning ability.
Papers
Search-Time Contamination in Deep Research Agents: Measuring Performance Inflation in Public Benchmark Evaluation
Yongjie Wang, Xinyue Zhang, Kunhong Yao, Zhiwei Zeng +3 more
The paper introduces the concept of Search-Time Contamination (STC), demonstrating that deep research agents can leak information from public benchmarks via web search, leading to an overestimation of…