Papers similar to 2605.28007

19 results

cs.CV · cs.LG · Recent · Jun 1, 2026

Disentanglement-Based Equivariant Learning for Compositional VQA

Zhou Du, Zhaoquan Yuan, Xiao Wu, Changsheng Xu

The paper proposes a novel Disentanglement-based Equivariant Learning (DEAL) framework that enhances compositional VQA by disentangling concepts and enforcing equivariant constraints, achieving state-…

cs.LG · cs.AI · Recent · May 28, 2026

Forget Less, Generalize More: Unifying Temporal and Structural Adaptation for Dynamic Graphs

Qian Chang, Ciprian Doru Giurcaneanu, Runsong Jia, Xia Li +5 more

The paper proposes Dual-Scale Retentive Dynamics (DSRD), a unified framework that improves representation learning on dynamic graphs by jointly modeling evolving temporal and structural dependencies.

q-bio.NC · cs.LG · Recent · Jun 1, 2026

How Optimality Structures Sparse Dictionaries: A Theory for Understanding SAE Representations

William Dorrell

The paper theoretically analyzes the properties that optimal sparse autoencoder (SAE) dictionaries must satisfy, deriving constraints that explain observed SAE behaviors like hierarchical splitting an…

cs.LG · Recent · Jun 1, 2026

Massive Spikes in LLMs are Bias Vectors: Mechanistic Uncovering and Spike-Free Quantization

Yung-Chin Chen, Chung Peng Lee, Ze-Wei Liou, Naveen Verma

The paper argues that large activation spikes in LLMs are structural vector biases, and proposes a novel quantization framework (INSERTQUANT) to eliminate these spikes, enabling robust low-bit quantiz…

cs.LG · cs.CL · Recent · May 30, 2026

Task Structure Reverses Layerwise State Encoding in Sequence Models

Yuhang Jiang

The paper demonstrates that the location and nature of state encoding in sequence models are not fixed architectural traits but are highly dependent on the specific task, showing that the encoding pro…

cs.LG · cs.AI · Recent · May 28, 2026

Composing Non-Conjugate Factor Graphs with Closed-Form Variational Inference

Mykola Lukashchuk, Kyrylo Yemets, Wouter M. Kouw, Dmitry Bagaev +3 more

The paper introduces a framework for composing deep probabilistic models using five specific factor-graph primitives that guarantee closed-form variational inference, thereby preserving tractability i…

cs.CL · Recent · May 29, 2026

The Latin Substrate: How Language Models Represent and Mediate Script Choice

Daniil Gurgurov, Alan Saji, Katharina Trinley, Josef van Genabith +1 more

This paper investigates how LLMs handle multiple writing systems, finding that while they use shared latent representations, they exhibit a structural bias that makes generating Latin script eas…

cs.AI · Recent · May 28, 2026

Quantifying and Optimizing Simplicity via Polynomial Representations

Tianren Zhang, Xiangxin Li, Minghao Xiao, Guanyu Chen +1 more

The paper introduces polynomial representations as a quantitative, distribution-aware metric for measuring model simplicity, demonstrating that the effective degree of this representation is a superio…

cs.AI · cs.CL · Recent · May 28, 2026

Locally Coherent, Globally Incoherent: Bounding Compositional Incoherence in Multi-Component LLM Agents

Anany Kotawala

The paper introduces a metric, the compositional residual eps*, to quantify how multi-component LLM agents violate basic probability axioms when combining local, coherent claims into a global predicti…

cs.LG · math.ST · stat.ME · Recent · Jun 1, 2026

Network Learning with Semi-relaxed Gromov-Wasserstein

Charles Dufour, Ulysse Naepels, Leonardo V. Santoro

The paper proposes a semi-relaxed Gromov-Wasserstein objective to estimate the latent connectivity structure of large-scale networks, achieving statistically consistent and efficient recovery of the u…

cs.LG · cs.AI · cs.CV · Recent · May 31, 2026

BRo-JEPA: Learning Modular Arithmetic in Latent Space

Divyansh Jha, Yuanfang Xie, Varan Mehra, Brennen Yu

The paper introduces BRo-JEPA, a latent world model that successfully learns modular arithmetic (like addition modulo 10) by explicitly imposing the circular structure of the problem into the latent s…

cs.AI · cs.DB · cs.IR · Recent · May 29, 2026

Vector Linking via Cross-Model Local Isometric Consistency

Ziying Chen, Yang Cao, He Sun, Beining Yang +1 more

The paper proposes a novel geometric embedding hashing method to recover object correspondences (vector links) between two embedding clouds generated by different black-box encoders using only a small…

cs.CV · cs.AI · Recent · May 28, 2026

VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion

Hidir Yesiltepe, Jiazhen Hu, Tuna Han Salih Meral, Adil Kaan Akan +3 more

VideoMLA introduces a novel Multi-Head Latent Attention (MLA) mechanism that replaces per-head KV caches with a shared low-rank content latent, significantly reducing memory and improving throughput f…

cs.LG · cs.AI · Recent · May 29, 2026

Learning Cardiac Latent Representations in Vectorcardiogram Space

Bosong Huang, Panzhen Zhao, Zengxiang Li, Patricia Lee +4 more

This paper introduces LVCG, a novel self-supervised framework that learns unified, view-invariant latent representations of cardiac electrical activity directly in the physically grounded Vectorcardio…

cs.AI · Recent · May 27, 2026

Clark Hash: Stateless Sparse Johnson-Lindenstrauss Quantization for Neural Embeddings

Stanislav Kirdey, Clark Labs Inc

Clark Hash is a stateless, deterministic quantization method that significantly reduces the storage size of neural embeddings while maintaining high accuracy for cosine similarity search.

cond-mat.mtrl-sci · cs.CE · cs.CL · Recent · May 29, 2026

A Padding Method for Enhanced Encoding of Inorganic Structures with Varying Chemical Compositions

Thang Dang, Haderbache Amir, Tzanakakis Alexandros, Yoshimoto Yuta

The paper introduces a novel padding method that leverages crystal symmetry to enhance the encoding of complex inorganic structures, significantly improving the generation of stable, novel materials.

cs.LG · cs.AI · cs.NE · Recent · May 27, 2026

BIRDNet: Mining and Encoding Boolean Implication Knowledge Graphs as Interpretable Deep Neural Networks

Tirtharaj Dash

BIRDNet is a novel, sparse, and interpretable deep neural network that encodes Boolean implication knowledge mined directly from tabular data, achieving performance comparable to dense models while dr…

cs.LG · cs.AI · Recent · May 29, 2026

Positional versus Symbolic Attention Heads: Learning Dynamics, RoPE Geometry, and Length Generalization

Felipe Urrutia, Juan José Alegría, Cinthia Sanchez Macias, Jorge Salas +2 more

The paper analyzes the distinct computational roles of positional versus symbolic attention heads in Transformers, demonstrating that symbolic mechanisms generalize more reliably to longer sequences t…

cs.CL · cs.AI · Recent · Jun 1, 2026

From Layers to Submodules: Rethinking Granularity in Replacement-Based LLM Compression

Elia Cunegatti, Marcus Vukojevic, Erik Nielsen, Giovanni Iacca

The paper proposes SubFit, a novel compression technique that achieves superior LLM compression by replacing non-contiguous, submodule-level components (Attention and FeedForward) with lightweight res…
