The paper introduces SCALR, a framework that generates synthetic user-item interaction events from a source domain to augment a target recommendation domain's training data, yielding statistically significant improvements in online A/B tests on an industrial recommendation platform.
Large-scale recommendation systems operate across diverse domains, yet they suffer from data sparsity and noisy implicit feedback. Traditional approaches mitigate these problems through model-specific knowledge distillation from source domains to a target domain. Inspired by the transformative success of synthetic data generation in large language models (LLMs), we introduce Synthetic Cross-domain Augmentation and Learning for Recommendation (SCALR), a framework that generates synthetic user-item interaction events for a target recommendation domain from observed events in a source domain. SCALR decomposes cross-domain learning into two modular stages. First, it translates observed user events from the source domain into the target domain, framing event generation as estimating the likelihood that a user would interact with a target-domain item, conditioned on their observed interactions in the source domain. Second, downstream models train on these synthetic events as cross-domain learning objectives, so the synthetic events augment the target domain's training data in a model-agnostic manner. Our approach yields statistically significant improvements in online A/B tests on an industrial recommendation platform. To the best of our knowledge, this is among the first works to explicitly frame cross-domain event transfer as synthetic data generation for recommendation systems.
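The two stages described in the abstract can be illustrated with a toy sketch. The abstract does not specify how the likelihood is estimated, so the code below substitutes a simple co-occurrence ratio computed from users seen in both domains. The function names, the averaging rule, the `threshold` value, and the toy events are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def group_by_user(events):
    """Map each user to the set of items they interacted with."""
    by_user = defaultdict(set)
    for user, item in events:
        by_user[user].add(item)
    return by_user


def fit_transfer_scores(source_events, target_events):
    """Estimate P(target item | source item) from users seen in both domains.

    ASSUMPTION: a co-occurrence ratio stands in for the paper's likelihood model.
    """
    src, tgt = group_by_user(source_events), group_by_user(target_events)
    src_count = defaultdict(int)
    pair_count = defaultdict(int)
    for user in src.keys() & tgt.keys():
        for s in src[user]:
            src_count[s] += 1
            for t in tgt[user]:
                pair_count[(s, t)] += 1
    return {pair: n / src_count[pair[0]] for pair, n in pair_count.items()}


def generate_synthetic_events(source_events, target_events, threshold=0.4):
    """Stage 1: score unseen (user, target item) pairs and keep the likely ones."""
    scores = fit_transfer_scores(source_events, target_events)
    src, tgt = group_by_user(source_events), group_by_user(target_events)
    target_items = {item for _, item in target_events}
    synthetic = []
    for user, history in src.items():
        for t in target_items - tgt.get(user, set()):
            # Average the per-source-item estimates over the user's source history.
            score = sum(scores.get((s, t), 0.0) for s in history) / len(history)
            if score >= threshold:
                synthetic.append((user, t, score))
    return sorted(synthetic)


def augment_training_data(target_events, synthetic_events):
    """Stage 2: merge observed events (weight 1.0) with weighted synthetic ones."""
    observed = [(user, item, 1.0) for user, item in target_events]
    return observed + list(synthetic_events)


# Hypothetical toy data: u3 has source-domain history but no target-domain events.
source_events = [("u1", "s1"), ("u2", "s1"), ("u3", "s1"), ("u3", "s2")]
target_events = [("u1", "t1"), ("u2", "t1"), ("u2", "t2")]

synthetic = generate_synthetic_events(source_events, target_events)
augmented = augment_training_data(target_events, synthetic)
```

Because the second stage only returns weighted `(user, item, weight)` rows, any downstream recommender that accepts sample weights could consume `augmented` unchanged, which is one way to read the abstract's "model-agnostic" claim.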
Breaking the Information Silo: Semantic Personas for Cross-Domain Recommendation
The paper proposes SPHERE, a novel framework that uses large language models to…
Domain-Specific Data Synthesis for LLMs via Minimal Sufficient Representation Learning
The paper introduces DOMINO, a novel inductive framework that synthesizes domain…
A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL
The paper proposes a local perturbation theory showing that cross-domain interfe…
Multimodal Music Recommendation System using LLMs
The paper proposes a novel multimodal framework for session-based music recommen…
Toward User Preference Alignment in LLM Recommendation via Explicit Context Feedback
The paper advocates for integrating explicit contextual feedback (like reviews a…
REED: Post-Training Representation Editing for Cross-Domain Linguistic Steganalysis
The paper proposes REED, a post-training representation editing method that sign…
SafeRx-Agent: A Knowledge-Grounded Multi-Agent Framework for Safe and Explainable Medication Recomme…
The paper introduces SafeRx-Agent, a knowledge-grounded multi-agent framework th…
Source-Grounded Semantic Reinforcement Learning for Low-Resource Target-Language Generation
The paper introduces Source-Grounded Semantic Reinforcement Learning (SG-SRL), a…