Zhuang Liu
4 indexed papers
Research Timeline
PASTA is a novel, twofold stealthy backdoor attack that achieves high-success-rate backdoor activation across arbitrary patches in Vision Transformers by leveraging the Trigger Radiating Effect (TRE).
DETOUR is a practical backdoor attack against object detection models that uses semantic triggers robust to variations in size, location, and field of view (FoV), overcoming limitations of existing fixed-trigger attacks.
The paper proposes VLM3, a simple, scalable method showing that standard Vision Language Models (VLMs) can natively learn 3D understanding, relying on architectural simplicity and specific data techniques.
The paper introduces CityGen, a diffusion-based framework that enables zero-label city adaptation for autonomous driving by synthesizing city-style data conditioned on HD maps and visual prompts, significantly improving cross-city generalization.
Papers
VLM3: Vision Language Models Are Native 3D Learners
Zhipeng Cai, Zhuang Liu, Yunyang Xiong, Zechun Liu +2 more
The paper proposes VLM3, a simple, scalable method that demonstrates standard Vision Language Models (VLMs) can natively learn 3D understanding by focusing on architectural simplicity and specific dat…