Yu Chen
37 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The study compares agentic data retrieval using unstructured web data versus structured, semantically-annotated datasets, concluding that semantic metadata remains essential for high-precision, reliable, and execution-oriented data discovery.
PromptEmbedder introduces a dual-LLM framework that efficiently and transferably adapts text embeddings by decoupling task-specific knowledge from the backbone model, significantly reducing computational overhead compared to methods like LoRA.
C-MIG is a novel retrieval-augmented generation framework that uses multi-view information gain to improve clinical diagnosis reasoning by providing richer, more nuanced reward signals than existing methods.
BORA is an offline-to-online RL framework that enhances dexterous VLA models for real-world robotics by using an action-conditioned critic and a lightweight residual adaptation mechanism to correct execution errors.
The paper introduces MemPoison, a novel memory poisoning attack that successfully injects triggerable backdoors into LLM agents' long-term memory through conversational interactions, achieving high attack success rates by bypassing selective memory mechanisms.
The paper introduces polynomial representations as a quantitative, distribution-aware metric for measuring model simplicity, demonstrating that the effective degree of this representation is a superior predictor of generalization compared to existing proxies.
The paper proposes MemPoison, a novel memory poisoning attack that injects triggerable backdoors into LLM agents' long-term memory through dialogue interactions, achieving high success rates by bypassing selective memory mechanisms.
The paper introduces SCALR, a novel framework that generates synthetic user-item interaction data from a source domain to augment a target recommendation domain, significantly improving system performance in A/B tests.
The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.
MixFP4 introduces a mixed micro-format extension to NVFP4, allowing blocks to dynamically select between two stored FP4 formats (E2M1 and E1M2) to improve quantization accuracy without altering the standard hardware execution path.
The paper proposes a graph-constrained approach to scale multi-hop training data by decoupling path discovery from path verbalization, significantly expanding the usable corpus size for LLMs.
The paper introduces PROBE, an optimization framework that guides LLM agents in structure-based drug design by performing controlled 'probe edits' to assess how molecular changes affect both binding affinity and druggability simultaneously.
TAPS introduces a target-aware prefix selection method that optimizes the trade-off between draft tree acceptance and verification cost, achieving significant speedups in speculative decoding.
The paper deconstructs latent visual reasoning tokens into components and finds that the performance gains are primarily due to boundary markers and attention patterns, not the tokens' ability to encode visual evidence.
SIRI introduces a self-internalizing reinforcement learning framework that allows LLM agents to autonomously discover and integrate reusable skills directly into their core policy, significantly improving performance on complex tasks without external skill generators.
The paper introduces MetricScenes, a new large-scale, in-the-wild dataset, and demonstrates that fine-tuning existing geometry models on this dataset significantly mitigates the scale-collapse problem in metric scale monocular geometry estimation.
The paper argues that observed gains in multimodal agents using tools may be due to learning tool-calling patterns rather than genuine capability expansion, finding that tool access provides little consistent aggregate improvement.
The paper proposes an objective-wise reputation-market mechanism to dynamically calibrate and gate LLM-generated expert priors in multi-objective Bayesian optimization, showing that dynamic calibration improves robustness over fixed priors.
JenBridge is a novel, adaptive framework that generates high-fidelity, long-form video soundtracks, significantly improving narrative coherence and naturalness across scene transitions.
The paper proposes Skill-RM, a unified framework that treats reward modeling as an agentic task to consistently integrate diverse evaluation criteria, achieving superior performance over traditional methods.
Papers
Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill
Tao Chen, Gangwei Jiang, Pengyu Cheng, Siyuan Huang +9 more
The paper proposes Skill-RM, a unified framework that treats reward modeling as an agentic task to consistently integrate diverse evaluation criteria, achieving superior performance over traditional m…