Papers similar to 2605.30022

~ similar to 2605.30022· 16 results

cs.LGcs.CCRecentJun 1, 2026

Rethinking the Role of Positional Encoding: Sliding-Window Transformers without PE Remain Turing Complete

The paper demonstrates that positional encodings are not necessary for transformers to achieve universal computation, showing that the inherent mechanism of sliding context windows already provides su…

View →

cs.LGcs.CLeess.SPRecentMay 31, 2026

Beyond Sinusoids: A Morlet Wavelet Framework for Transformer Positional Encoding

Athanasios Zeris

The paper introduces Morlet Positional Encoding (MoPE), a novel wavelet-based positional encoding that models position and locality simultaneously, outperforming standard sinusoidal and RoPE methods.

View →

cs.LGcs.AIRecentMay 29, 2026

Positional versus Symbolic Attention Heads: Learning Dynamics, RoPE Geometry, and Length Generalization

Felipe Urrutia, Juan José Alegría, Cinthia Sanchez Macias, Jorge Salas +2 more

The paper analyzes the distinct computational roles of positional versus symbolic attention heads in Transformers, demonstrating that symbolic mechanisms generalize more reliably to longer sequences t…

View →

cs.CLcs.AIRecentMay 27, 2026

Periodic RoPE for Infinite Context LLMs

Simin Huo

The paper proposes Periodic RoPE (P-RoPE) combined with a dual-layer attention mechanism to overcome the positional encoding limitations of LLMs, enabling theoretically infinite context understanding.

View →

cs.CVcs.AIcs.LGRecentMay 29, 2026

Envisioning Beyond the Few: Disentangled Semantics and Primitives for Few-Shot Atypical Layout-to-Image Generation

Nan Bao, Yifan Zhao, Wenzhuang Wang, Jia Li

The paper proposes a disentangled representation framework to significantly improve few-shot layout-to-image generation by separating semantic identity from local visual details, thereby mitigating re…

View →

cs.CLcs.AIRecentMay 27, 2026

The Attentional White Bear Effect in Transformer Language Models

Rebecca Ramnauth, Brian Scassellati

The paper demonstrates that content suppression techniques used in language models only mask prohibited content at the output level, failing to eliminate the underlying concepts from the model's inter…

View →

cs.AIRecentMay 28, 2026

Benchmarking Positional Encoding Strategies for Transformer-Based EEG Foundation Models

Ayse Betul Yuce, Sebastian Stober

This paper benchmarks five positional encoding strategies for transformer-based EEG foundation models, concluding that the optimal encoding is task-dependent and no single strategy is universally supe…

View →

cs.LGcs.AIRecentMay 27, 2026

Locality-Aware Redundancy Pruning for LLM Depth Compression

Vincent-Daniel Yun, Youngrae Kim, Woosang Lim, YoungJin Heo +2 more

The paper proposes Locality-Aware Redundancy Pruning (LoRP), a training-free method that prunes LLM layers by exploiting localized inter-layer redundancy, leading to improved efficiency while maintain…

View →

cs.LGcs.AIRecentMay 27, 2026

Semantic Optimal Transport for Sparse Autoencoder Feature Matching and Circuit Compression

Tue M. Cao, Nguyen Do, My T. Thai

The paper introduces a distributional framework using Wasserstein distance to unify the semantic comparison of sparse autoencoder features across different layers and to automatically compress large f…

View →

cs.IRcs.AIcs.CLRecentMay 29, 2026

On the impact of retrieved content representations in RAG Pipelines

Jonathan J Ross, Bevan Koopman, Anton van der Vegt, Guido Zuccon

The paper systematically compares multiple content representations for RAG pipelines and finds that answer retention—the ability of the representation to preserve the original answer-bearing content—i…

View →

cs.LGcs.CLRecentMay 30, 2026

Task Structure Reverses Layerwise State Encoding in Sequence Models

Yuhang Jiang

The paper demonstrates that the location and nature of state encoding in sequence models are not fixed architectural traits but are highly dependent on the specific task, showing that the encoding pro…

View →

cs.SDcs.AIcs.IRRecentMay 29, 2026

Latent Space Disentanglement via Activation Steering for Interpretable Attribute Control in Symbolic Music Generation

Ioannis Prokopiou, Pantelis Vikatos, Maximos Kaliakatsos-Papakostas, Theodoros Giannakopoulos +1 more

The paper proposes an inference-time activation steering framework, utilizing orthogonalization, to achieve fine-grained, deterministic control over discrete musical attributes like Pitch and Duration…

View →

cs.AIRecentMay 31, 2026

Emergent Ordinal Geometry in Transformers Trained on Local Comparisons

Nishit Singh

The paper demonstrates that Transformers trained on local comparisons implicitly learn a global, one-dimensional ordinal structure, mirroring the human ability to perform transitive inference.

View →

cs.LGRecentJun 1, 2026

Massive Spikes in LLMs are Bias Vectors: Mechanistic Uncovering and Spike-Free Quantization

Yung-Chin Chen, Chung Peng Lee, Ze-Wei Liou, Naveen Verma

The paper argues that large activation spikes in LLMs are structural vector biases, and proposes a novel quantization framework (INSERTQUANT) to eliminate these spikes, enabling robust low-bit quantiz…

View →

cs.CLcs.LGRecentJun 1, 2026

Resonant Context Anchoring: Decoupling Attention Routing and Signal Gain at Inference Time

Mingkuan Zhao, Yide Gao, Wentao Hu, Suquan Chen +5 more

The paper proposes Resonant Context Anchoring (RCA), a lightweight, training-free method that enhances factual faithfulness in LLMs by dynamically amplifying the signal of external context evidence du…

View →

cs.CLcs.AIRecentMay 29, 2026

What Gets Unmasked First? Trajectory Analysis of Diffusion Models for Graph-to-Text Generation

Qing Wang, Jacob Devasier, Chengkai Li

This paper analyzes the decoding process of masked diffusion models for graph-to-text generation, finding that structural fine-tuning disrupts natural entity-first generation and proposing a structura…

View →