ArXivCSExplorer
Built by Teycir Ben Soltane

20 results for “quantization”

CS papers only

Hybrid search: keyword + semantic, ranked by combined score.
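
As an illustration of what "ranked by combined score" could mean, here is a minimal sketch of one common way to blend keyword and semantic scores. The page does not say how this site computes its score, so the function names, the min-max normalization, and the `alpha` weight are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; an empty or constant set maps to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(
    keyword: Dict[str, float],
    semantic: Dict[str, float],
    alpha: float = 0.5,  # hypothetical weight on the keyword score
) -> List[Tuple[str, float]]:
    """Blend normalized keyword and semantic scores, best first."""
    kw, sem = min_max(keyword), min_max(semantic)
    combined = {
        doc_id: alpha * kw.get(doc_id, 0.0) + (1 - alpha) * sem.get(doc_id, 0.0)
        for doc_id in set(kw) | set(sem)
    }
    # Sort by descending score, then by id so ties are deterministic.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    keyword_scores = {"a": 3.0, "b": 1.0, "c": 2.0}   # e.g. BM25-style scores
    semantic_scores = {"a": 0.2, "b": 0.9, "c": 0.6}  # e.g. cosine similarities
    for doc_id, score in hybrid_rank(keyword_scores, semantic_scores):
        print(doc_id, round(score, 3))
```

Normalizing each score set first keeps one scale from dominating the other; setting `alpha` to 1.0 or 0.0 reduces the ranking to pure keyword or pure semantic order.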

cs.CR • cs.LG • Recent • Apr 29, 2026

Quantamination: Dynamic Quantization Leaks Your Data Across the Batch

Hanna Foerster, Ilia Shumailov, Cheng Zhang, Yiren Zhao +2 more

This paper identifies a critical privacy vulnerability, termed Quantamination, where dynamic quantization in popular ML frameworks can leak sensitive user data across batch boundaries.

cs.CL • Recent • May 31, 2026

When Is 0.1% Enough? Analyzing the Combined Effects of Dimensionality Reduction and Quantization on Text Embedding Compression

Riku Kisako, Hayato Tsukagoshi, Ryohei Sasano

This paper systematically analyzes combining dimensionality reduction and quantization to compress text embeddings, showing that this combined approach achieves substantial compression (e.g., 0.1% siz…

cs.LG • cs.AI • Recent • May 28, 2026

HARP: Hadamard-Preconditioned Adaptive Rotation Processor for Extreme LLM Quantization

Artur Zagitov, Gleb Molodtsov, Aleksandr Beznosikov

HARP introduces a novel, adaptive, learnable orthogonal processor that significantly improves the robustness and accuracy of extreme low-bit LLM quantization compared to fixed methods.

cs.LG • Recent • Jun 1, 2026

Massive Spikes in LLMs are Bias Vectors: Mechanistic Uncovering and Spike-Free Quantization

Yung-Chin Chen, Chung Peng Lee, Ze-Wei Liou, Naveen Verma

The paper argues that large activation spikes in LLMs are structural vector biases, and proposes a novel quantization framework (INSERTQUANT) to eliminate these spikes, enabling robust low-bit quantiz…

cs.AR • cs.ET • Recent • Jun 4, 2026

FQA: A Full-Space Quantization-Driven Architecture for Hardware-Efficient Piecewise Approximation of Nonlinear Activation Functions

Chenjun Hao, Feng Yan, Hongbing Pan, Yuxuan Wang

This paper introduces a novel full-space quantization-driven architecture (FQA) to create highly efficient and accurate hardware approximations of nonlinear activation functions using piecewise polyno…

cs.AI • Recent • May 28, 2026

LFQ: Logit-aware Final-block Quantization for Boosting the Generation Quality of Low-Bit Quantized LLMs

Jung Hyun Lee, June Yong Yang, Jungwook Choi, Eunho Yang

The paper introduces Logit-aware Final-block Quantization (LFQ), an enhancement to block-wise quantization that quantizes the final Transformer block using a cross-entropy loss to significantly boost…

cs.CV • cs.AI • Recent • May 30, 2026

Collaborative Few-Step Distillation and Low-Bit Quantization for Wan2.2 Dual-Expert Video Diffusion Models

Jinyang Du, Shenghao Jin, Ziqian Xu, Ruihao Gong +4 more

The paper proposes a compression pipeline combining few-step distillation and low-bit quantization to significantly reduce the deployment cost and parameter footprint of large dual-expert video diffus…

cs.AI • cs.LG • Recent • Jun 1, 2026

Extreme Low-Bit Inference in Reasoning Models: Failure Modes and Targeted Recovery

Ekaterina Alimaskina, Darya Rudas, Denis Shveykin, Gleb Molodtsov +2 more

The paper analyzes the failure modes of aggressive 2-bit quantization in large reasoning models, proposing lightweight controls like FP16 planning and loop rescue to restore accuracy and achieve pract…

cs.CR • Recent • Apr 20, 2026

Privacy-Preserving Product-Quantized Approximate Nearest Neighbor Search Framework for Large-scale Datasets via A Hybrid of Fully Homomorphic Encryption and Trusted Execution Environment

Shozo Saeki, Minoru Kawahara, Hirohisa Aman

The paper proposes a Privacy-Preserving Product-Quantization Approximate Nearest Neighbor (PPPQ-ANN) framework that achieves practical performance and strong privacy guarantees for large-scale nearest…

cs.CV • cs.AI • cs.LG • Recent • May 27, 2026

Do We Really Need Quantum Machine Learning?: A Multidimensional Empirical Study

Sudip Vhaduri, Ryan Gammon, Sayanton Dibbo

This study empirically benchmarks classical and quantum machine learning models for image recognition, finding that while quantum models offer superior accuracy and resource efficiency at high dimensi…

quant-ph • cs.AI • Recent • May 31, 2026

Quantum Algorithm for Distributed Reduction of Entanglements (QADR): A Trainable and Simulation-Efficient QML Framework

Syed Farhan Ahmad, Gregory T. Byrd

The paper introduces QADR, a novel hybrid quantum-classical framework that efficiently trains variational quantum circuits by localizing entanglement reduction, thereby overcoming the exponential memo…

cs.LG • cs.AI • Recent • May 28, 2026

Automatically Differentiable Nonlinear Tensor Networks (ADNTNs) for Exponential Compression of Deep Neural Networks

Andrzej Cichocki, Michal Wietczak

The paper introduces Automatically Differentiable Nonlinear Tensor Networks (ADNTNs) to achieve massive, structured compression of deep neural networks, demonstrating compression ratios up to 77,000x…

cs.AI • Recent • May 27, 2026

Clark Hash: Stateless Sparse Johnson-Lindenstrauss Quantization for Neural Embeddings

Stanislav Kirdey, Clark Labs Inc

Clark Hash is a stateless, deterministic quantization method that significantly reduces the storage size of neural embeddings while maintaining high accuracy for cosine similarity search.

quant-ph • cs.CG • math.AT • Recent • May 27, 2026

Quantum encodings that preserve persistent homology

Arthur J. Parzygnat, Andrew Vlasic

The paper investigates which quantum encodings can be applied directly to classical data point clouds while preserving the topological invariants necessary for topological data analysis (TDA).

cs.CR • Recent • May 3, 2026

Contrastive Privacy: A Semantic Approach to Measuring Privacy of AI-based Sanitization

George Bissias, Eugene Bagdasarian, Brian Neil Levine

The paper introduces 'contrastive privacy,' a formal, model-agnostic, and quantitative method for evaluating the semantic success of AI-based sanitization across multiple media modalities.

quant-ph • cs.AI • cs.CR • Recent • Apr 29, 2026

Quantum Gatekeeper: Multi-Factor Context-Bound Image Steganography with VQC Based Key Derivation on Quantum Hardware

Sahil Tomar, Sandeep Kumar

Quantum Gatekeeper is a robust, multi-factor context-bound image steganography framework that embeds payloads using LSB and derives a gate key from a Variational Quantum Circuit (VQC), ensuring recove…

cs.LG • cs.AI • stat.ML • Recent • May 30, 2026

Quantum Tunneling-Aware Machine Learning: Physics-Derived Noise Models for Robust Deployment

Uiwon Hwang, Jaeho Hwang

The paper introduces Quantum Tunneling-Aware Machine Learning (QTAML) and a compensation algorithm (TAC) that accurately models and compensates for quantum tunneling errors in AI inference, achieving…

cond-mat.dis-nn • quant-ph • stat.ML • Recent • Jun 4, 2026

Nonreversible Gauge Fields in Fokker–Planck Dynamics: Supersymmetric Hamiltonians and Learned Finite Forces

Masayuki Ohzeki

The paper reformulates nonreversible perturbations of Fokker–Planck dynamics as gauge fields, providing a unified operator viewpoint to analyze relaxation processes and develop methods for learning o…

cs.LG • cs.AI • Recent • May 29, 2026

Geometric Erasure by Contrastive Velocity Matching in Rectified Flows

Jonas Henry Grebe, Tobias Braun, Anna Rohrbach, Marcus Rohrbach

The paper introduces GEM, an effective concept erasure framework for Rectified Flow Transformers, by unifying trajectory-based unlearning with classic teacher-guided flow matching.
