Tong Zhang
8 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces the PrivacyIceberg framework to systematically categorize and empirically demonstrate the high risk of automated, deep personal profiling using LLM agents, revealing a significant gap between public concern and platform safeguards.
PRO-CUA introduces a process-reward optimization framework that enables efficient, step-level reinforcement learning for training computer use agents by decoupling environment interaction from policy optimization.
The paper introduces Q-ALIGN DT, a novel framework that improves conditioned sequence models by enforcing alignment between the input return-to-go (RTG) signal and the output policy's expected Q-value, leading to superior policy controllability and performance.
LARK introduces a novel learnability-grounded approach for selecting reasoning trajectories, significantly improving the efficiency of reasoning distillation by prioritizing trajectories that the student model can learn from.
Qwen-VLA introduces a unified embodied foundation model that extends vision-language understanding to continuous action generation, enabling robust, multi-task generalization across diverse robotic tasks and embodiments.
The paper proposes FLAME, a novel framework that detects AI-generated image forgeries by identifying intrinsic energy anomalies caused by the diffusion process, achieving state-of-the-art localization.
The paper introduces OpenWebRL, an open framework that enables training visual web agents using online multi-turn Reinforcement Learning directly on live websites, achieving state-of-the-art performance on challenging web benchmarks.
The paper introduces RedEdit, an agentic red-teaming framework that demonstrates that malicious images can be easily edited to bypass safety classifiers while retaining their harmful semantics.
Papers
RedEdit: Agentic Red-Teaming of Image Safety Classifiers via MCTS-Guided Photo-Editing
Weilin Lin, Ziqi Lin, Zhenxing Zhou, Jianze Li +3 more
The paper introduces RedEdit, an agentic red-teaming framework that demonstrates that malicious images can be easily edited to bypass safety classifiers while retaining their harmful semantics.