Jie Chen
15 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
TEMPLATEFUZZ is a fine-grained fuzzing framework that systematically tests chat templates to find vulnerabilities in LLMs, achieving high jailbreak success rates with minimal performance degradation.
The paper proposes Cluster Segregation Concealment (CSC), a novel defense that identifies and neutralizes backdoor triggers by relabeling poisoned samples to a virtual class, achieving near-zero attack success rates with minimal accuracy loss.
This paper analyzes the failure of current embedding-based defenses in multi-agent LLM systems and proposes using token-level confidence scores (logits) for improved robustness.
MemMark introduces a state-evolution attribution watermark that embeds owner-controlled signals into latent memory-write decisions, enabling robust provenance tracking for agent memory even when all traditional logs and metadata are lost.
The paper proposes using GUI agents, both as objective evaluators and subjective playtesters, to significantly improve the generation of playable games from prompts, demonstrating a 66.8% rubric pass-rate with a novel iterative framework.
The paper proposes a novel multimodal framework for session-based music recommendation that jointly models audio, lyric, and semantic content signals within a unified LLM-based sequential reasoning system.
The paper introduces CROP, a novel conformal procedure that provides rigorous statistical guarantees for certifying the longest safe prefix of a language model's reasoning trace, allowing for targeted error identification and repair.
The paper introduces AnyMo, a unified multimodal framework that enables high-quality, scalable conditional human motion generation by leveraging a massive, cross-modal dataset and a masked modeling transformer.
The paper introduces DynSess, a novel session-level framework that evaluates and optimizes role-playing agents by assessing long-horizon conversational quality, significantly outperforming existing turn-level methods.
FlowTime proposes a novel Continuous Generative Regression framework using a Flow-based Personalized Prior to accurately model the multimodal and heterogeneous nature of user watch time prediction, significantly outperforming existing state-of-the-art methods.
The paper reframes Parameter-Efficient Fine-Tuning (PEFT) from a mere cost-saving alternative to a robust architecture for creating persistent, personalized models that layer specific behaviors onto large shared foundation models.
STaR-KV introduces a novel, training-free KV cache compression framework that adaptively re-weights token importance across spatial, temporal, and distributional axes, significantly reducing GPU memory usage for GUI vision-language models while maintaining high accuracy.
The paper introduces LongJudgeBench, a new benchmark designed to evaluate the reliability of LLM judges specifically for complex, long-form output evaluation, revealing significant instability gaps in current LLM judging methods.
The paper proposes OneReason, a framework that enhances the reasoning capability of generative recommendation models by focusing on improving item perception and structuring user behavior into coherent latent interests.
This paper proposes a post-training framework called Retrieval-Augmented Reinforcement Fine-Tuning (RA-RFT) to teach language models to reason by analogy.
Papers
Learning to Reason by Analogy via Retrieval-Augmented Reinforcement Fine-Tuning
Zilin Xiao, Qi Ma, Chun-cheng Jason Chen, Xintao Chen +3 more
This paper proposes a post-training framework called Retrieval-Augmented Reinforcement Fine-Tuning (RA-RFT) to teach language models to reason by analogy.