~ similar to 2605.29453 · 19 results
The paper introduces Graph Cascades, a mesoscopic rewiring technique that enhances Graph Neural Networks by promoting node pairs with strong multi-hop connections to direct edges, improving performanc…
MARS proposes an encoder-agnostic aggregation operator that explicitly models multi-scale temporal structure in sequential recommendation, achieving state-of-the-art performance across both sparse and…
This paper proposes Supervised Memory Training (SMT), a method for training nonlinear RNNs that sidesteps recurrent credit propagation entirely.
This paper proposes Supervised Memory Training (SMT), a method for training nonlinear RNNs that sidesteps recurrent credit propagation entirely.
The paper introduces the Terminal Representation (TR), a novel, lower-dimensional, and structurally distinct formulation for encoding reward-weighted trajectories in RL that bypasses the need for eige…
The paper introduces the Vector Network (VN), a novel recurrent architecture that replaces fixed weight matrices with reusable weight atoms, enabling superior compositional generalization by making st…
Jizhan Fang, Buqiang Xu, Zhixian Wang, Haoliang Cao +11 more
The paper proposes FluxMem, a novel connectivity-evolving memory framework that models memory as a dynamic graph to improve LLM agent performance in complex, changing environments.
The paper proposes a semi-relaxed Gromov-Wasserstein objective to estimate the latent connectivity structure of large-scale networks, achieving statistically consistent and efficient recovery of the u…
Qiao Xiao, Boqian Wu, Patrik Okanovic, Tomasz Sternal +5 more
The paper introduces Sparse Memory-Efficient Training (SMET), a method that stabilizes and optimizes Dynamic Sparse Training (DST) for large language models, enabling stable and memory-efficient spars…
The paper introduces Complexity-Balanced Splitting (CBS), a framework that efficiently allocates model capacity across the diffusion timeline by focusing computational resources on the most complex ge…
The paper proposes a Doeblin-anchored contrastive chart to learn valid Markov transition kernels by combining the target transition with a restart law, ensuring the learned object is mathematically so…
The paper formally addresses the challenging question of cross-domain transferability of latent predictive models by proposing a structured framework that quantifies the relationship between source an…
Kesha Ou, Zhen Tian, Wayne Xin Zhao, Long Zhang +2 more
This paper proposes a novel framework, DS-MLP, for click-through rate prediction in online advertising and recommendation systems.
Zhikun Xu, Yu Feng, Jacob Dineen, Taiwei Shi +2 more
The paper proposes ReuseRL, a method that improves agent generalization in Reinforcement Learning by enforcing structural compressibility of successful agent trajectories into reusable skills.
Zhichen Tang, Zhengzheng Dang, Yulin Chen, Jixin Wu +2 more
EvoMD-LLM introduces a novel framework that models reactive molecular dynamics as a symbolic temporal language problem, enabling LLMs to accurately predict complex, time-evolving chemical processes.
The paper introduces 'layered mutability,' a framework for analyzing how persistent self-modifying AI agents drift away from intended behavior due to the accumulation of locally reasonable, uncoordina…
Zhi Zhou, Ming Yang, Shi-Yu Tian, Kun-Yang Yu +2 more
The paper establishes the first theoretical framework for analyzing the learnability of Test-Time Adaptation (TTA) under non-stationary data streams by introducing Recovery Complexity, which quantifie…
The paper proposes $D^3$, a dynamic graph-constrained scheduling framework that optimizes LLM training order by modeling sample interactions as a dynamic influence graph.
The paper proposes a unified framework for designing efficient and expressive token mixing layers by separating the direct and recurrent influences of inputs, allowing for a principled trade-off betwe…