ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

20 results for “auto-encoder”

CS papers only

Hybrid search: Keyword + semantic, ranked by combined score.ⓘ

Want pure semantic search? Try claim verification →

cs.LGRecentJun 3, 2026

BBOmix: A Tabular Benchmark for Hyperparameter Optimization of Unsupervised Biological Representation Learning

Luca Thale-Bombien, Jan Ewald, Ralf König, Aaron Klein

This paper introduces BBOmix, an open-source benchmark for unsupervised representation learning on real-world biological data.

View →
cs.LGcs.AIRecentMay 27, 2026

ReSAE: Residualized Sparse Autoencoders for Multi-Layer Transformer Interventions

Prathyush Poduval, Calvin Yeung, Neel Desai, Mohsen Imani

The paper introduces Residualized Sparse Autoencoders (ReSAEs) to improve multi-layer interventions in transformers by training each layer on the residual activation, which better preserves cross-laye…

View →
cs.LGcs.AIRecentMay 27, 2026

Semantic Optimal Transport for Sparse Autoencoder Feature Matching and Circuit Compression

Tue M. Cao, Nguyen Do, My T. Thai

The paper introduces a distributional framework using Wasserstein distance to unify the semantic comparison of sparse autoencoder features across different layers and to automatically compress large f…

View →
cs.CLcs.AIRecentMay 28, 2026

Give it Space! Explicit Disentangling of Positional and Semantic Representations in Encoders

Pierre-Antoine Lequeu, Camille Barboule, Benjamin Piwowarski

The paper proposes explicitly disentangling positional and semantic representations in Transformer encoders, demonstrating that this separation allows for a clearer understanding of how positional inf…

View →
q-bio.NCcs.LGRecentJun 1, 2026

How Optimality Structures Sparse Dictionaries: A Theory for Understanding SAE Representations

William Dorrell

The paper theoretically analyzes the properties that optimal sparse autoencoder (SAE) dictionaries must satisfy, deriving constraints that explain observed SAE behaviors like hierarchical splitting an…

View →
eess.AScs.AIcs.SDRecentMay 27, 2026

LoSATok: Low-dimensional Semantic-Acoustic Tokenizer for Cross-Domain Audio Understanding and Generation

Zhisheng Zhang, Xiang Li, Yixuan Zhou, Jing Peng +2 more

LoSATok proposes a low-dimensional semantic-acoustic tokenizer that efficiently compresses high-dimensional audio features into a compact latent space, significantly improving the performance and effi…

View →
cs.AIRecentMay 28, 2026

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

Adly Templeton, Tom Conerly, Jonathan Marcus, Jack Lindsey +22 more

The paper demonstrates that sparse autoencoders can successfully extract a large set of interpretable, causally influential features from the production-scale Claude 3 Sonnet language model.

View →
cs.CRcs.AIcs.MMRecentMar 31, 2026

Mean Masked Autoencoder with Flow-Mixing for Encrypted Traffic Classification

Xiao Liu, Xiaowei Fu, Fuxiang Huang, Lei Zhang

The paper proposes Mean MAE (MMAE), a novel self-supervised pre-training framework that uses flow mixing and teacher-student distillation to improve encrypted traffic classification by capturing multi…

View →
cs.AIcs.CRRecentMay 27, 2026

Refusal Before Decoding: Detecting and Exploiting Refusal Signals in Intermediate LLM Activations

Matteo Gioele Collu, Riccardo Conte, Alberto Giaretta, Denis Kleyko +3 more

The paper demonstrates that refusal behavior in Large Language Models (LLMs) is encoded as an actionable, linearly decodable signal in intermediate transformer activations, allowing for early detectio…

View →
cs.AIcs.CRRecentMay 27, 2026

Refusal Before Decoding: Detecting and Exploiting Refusal Signals in Intermediate LLM Activations

Matteo Gioele Collu, Riccardo Conte, Alberto Giaretta, Denis Kleyko +3 more

The paper demonstrates that refusal behavior in Large Language Models (LLMs) is encoded as an actionable, linearly decodable signal in intermediate transformer activations, allowing for early detectio…

View →
cs.CLcs.AIcs.LGRecentMay 29, 2026

Not All Synthetic Data Is Yours to Learn From

Sina Alemohammad, Li Chen, Richard G. Baraniuk, Zhangyang Wang

Weak self-training on synthetic data can amplify a language model's existing capabilities, but this effect is strictly dependent on the compatibility between the source and student models, not on the…

View →
cs.CLcs.AIcs.LGRecentMay 29, 2026

Steering LLMs? Actually, Sparse Autoencoders can outperform simple baselines

Mikkel Godsk Jørgensen, Lars Kai Hansen

This paper demonstrates that Sparse Autoencoders (SAEs) can effectively steer Large Language Models (LLMs) on the AxBench benchmark, achieving performance comparable to LoRA baselines when combined wi…

View →
cs.CRRecentMay 6, 2026

GLiNER Guard: Unified Encoder Family for Production LLM Safety and Privacy

Bogdan Minko, Sabrina Sadiekh, Evgeniy Kokuykin

GLiNER Guard (GLiGuard) introduces a unified, efficient encoder family that simultaneously performs safety classification and PII detection in a single forward pass, offering a practical, low-cost alt…

View →
cs.LGcs.AIRecentMay 30, 2026

Richer Representations for Neural Algorithmic Reasoning via Auxiliary Reconstruction

Jiafu Huang, Chao Peng, Chenyang Xu, Zhengfeng Yang +6 more

The paper proposes using an auxiliary reconstruction task, specifically one that captures intra-state feature dependencies, to improve the quality of state representations learned by the encoder in ne…

View →
cs.LGcs.AIRecentMay 28, 2026

Automatically Differentiable Nonlinear Tensor Networks (ADNTNs) for Exponential Compression of Deep Neural Networks

Andrzej Cichocki, Michal Wietczak

The paper introduces Automatically Differentiable Nonlinear Tensor Networks (ADNTNs) to achieve massive, structured compression of deep neural networks, demonstrating compression ratios up to 77,000x…

View →
cs.LGcs.AIcs.CRRecentMar 25, 2026

Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs

Alexander Panfilov, Peter Romov, Igor Shilov, Yves-Alexandre de Montjoye +2 more

The paper demonstrates that using advanced AI agents in an autoresearch loop can discover novel and highly effective adversarial attack algorithms, significantly advancing the state-of-the-art for jai…

View →
cs.LGcs.CLeess.SPRecentMay 31, 2026

Beyond Sinusoids: A Morlet Wavelet Framework for Transformer Positional Encoding

Athanasios Zeris

The paper introduces Morlet Positional Encoding (MoPE), a novel wavelet-based positional encoding that models position and locality simultaneously, outperforming standard sinusoidal and RoPE methods.

View →
cs.CLRecentMay 29, 2026

How Far Do Auto-Interpretation Labels Generalize: A Controlled Study Across Languages, Scripts, and Rewordings

Sripad Karne

The study investigates the generalization of auto-generated natural-language labels for language model features, finding that while the underlying features show cross-lingual semantic consistency, the…

View →
cs.SDcs.AIcs.CRRecentJun 4, 2026

Beyond Waveform Robustness: Robust Feature-Vocoder Adversarial Attacks on Automatic Speech Recognition

Yifan Liao, Zongmin Zhang, Zhen Sun, Yuhui Sun +2 more

The paper introduces a novel Clean-Referenced Feature-Vocoder Attack, a black-box adversarial attack that perturbs high-level SSL feature representations instead of raw audio waveforms, achieving supe…

View →
cs.CVcs.AIcs.CLRecentJun 1, 2026

AdaCodec: A Predictive Visual Code for Video MLLMs

Haowen Hou, Zhen Huang, Zheming Liang, Qingyi Si +7 more

AdaCodec introduces a predictive visual coding scheme for video MLLMs, significantly improving efficiency and performance by transmitting only inter-frame changes and full reference frames when necess…

View →