Similar to 2605.28007 · 19 results
The paper proposes a novel Disentanglement-based Equivariant Learning (DEAL) framework that enhances compositional VQA by disentangling concepts and enforcing equivariant constraints, achieving state-…
Qian Chang, Ciprian Doru Giurcaneanu, Runsong Jia, Xia Li +5 more
The paper proposes Dual-Scale Retentive Dynamics (DSRD), a unified framework that improves representation learning on dynamic graphs by jointly modeling evolving temporal and structural dependencies.
The paper theoretically analyzes the properties that optimal sparse autoencoder (SAE) dictionaries must satisfy, deriving constraints that explain observed SAE behaviors like hierarchical splitting an…
The paper argues that large activation spikes in LLMs are structural vector biases, and proposes a novel quantization framework (INSERTQUANT) to eliminate these spikes, enabling robust low-bit quantiz…
The paper demonstrates that the location and nature of state encoding in sequence models are not fixed architectural traits but are highly dependent on the specific task, showing that the encoding pro…
The paper introduces a framework for composing deep probabilistic models using five specific factor-graph primitives that guarantee closed-form variational inference, thereby preserving tractability i…
This paper investigates how LLMs handle multiple writing systems, finding that while they use shared latent representations, they exhibit a structural bias that makes generating Latin script eas…
Tianren Zhang, Xiangxin Li, Minghao Xiao, Guanyu Chen +1 more
The paper introduces polynomial representations as a quantitative, distribution-aware metric for measuring model simplicity, demonstrating that the effective degree of this representation is a superio…
The paper introduces a metric, the compositional residual eps*, to quantify how multi-component LLM agents violate basic probability axioms when combining local, coherent claims into a global predicti…
The paper proposes a semi-relaxed Gromov-Wasserstein objective to estimate the latent connectivity structure of large-scale networks, achieving statistically consistent and efficient recovery of the u…
The paper introduces BRo-JEPA, a latent world model that successfully learns modular arithmetic (like addition modulo 10) by explicitly imposing the circular structure of the problem on the latent s…
Ziying Chen, Yang Cao, He Sun, Beining Yang +1 more
The paper proposes a novel geometric embedding hashing method to recover object correspondences (vector links) between two embedding clouds generated by different black-box encoders using only a small…
VideoMLA introduces a novel Multi-Head Latent Attention (MLA) mechanism that replaces per-head KV caches with a shared low-rank content latent, significantly reducing memory and improving throughput f…
Bosong Huang, Panzhen Zhao, Zengxiang Li, Patricia Lee +4 more
This paper introduces LVCG, a novel self-supervised framework that learns unified, view-invariant latent representations of cardiac electrical activity directly in the physically grounded Vectorcardio…
Clark Hash is a stateless, deterministic quantization method that significantly reduces the storage size of neural embeddings while maintaining high accuracy for cosine similarity search.
The paper introduces a novel padding method that leverages crystal symmetry to enhance the encoding of complex inorganic structures, significantly improving the generation of stable, novel materials.
BIRDNet is a novel, sparse, and interpretable deep neural network that encodes Boolean implication knowledge mined directly from tabular data, achieving performance comparable to dense models while dr…
The paper analyzes the distinct computational roles of positional versus symbolic attention heads in Transformers, demonstrating that symbolic mechanisms generalize more reliably to longer sequences t…
The paper proposes SubFit, a novel compression technique that achieves superior LLM compression by replacing non-contiguous, submodule-level components (Attention and FeedForward) with lightweight res…