Papers similar to 2606.00130

~ similar to 2606.00130 · 18 results

cs.LG · cs.AI · Empirical · Recent · Jun 4, 2026

PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training

Senmiao Wang, Tiantian Fang, Haoran Zhang, Yushun Zhang +3 more

This paper proposes a preconditioning layer for stable weight conditioning in LLM training.

cs.CV · cs.AI · cs.LG · Recent · May 27, 2026

Do We Really Need Quantum Machine Learning?: A Multidimensional Empirical Study

Sudip Vhaduri, Ryan Gammon, Sayanton Dibbo

This study empirically benchmarks classical and quantum machine learning models for image recognition, finding that while quantum models offer superior accuracy and resource efficiency at high dimensi…

cs.LG · cond-mat.dis-nn · cs.NE · Recent · Jun 2, 2026

Training a Predictive Coding Network on ImageNet using Equilibrium Propagation

Tugdual Kerjan, Rasmus Høier, Benjamin Scellier

The paper introduces an Equilibrium Propagation (EP)-based training method for Predictive Coding Networks (PCNs), successfully training a large-scale VGG10 model on ImageNet and achieving state-of-the…

cs.LG · cs.AI · Recent · May 31, 2026

Neural Network Compression by Approximate Differential Equivalence

Ravi Dhiman, Andrea Passarella, Mirco Tribastone, Lorenzo Valerio

The paper proposes a novel neural network compression technique that aggregates neurons with similar functional dynamics, achieving significant model size reduction while maintaining high accuracy.

cs.LG · Recent · Jun 1, 2026

Expressivity of congruence-based architectures for DNNs on positive-definite matrices

Antonin Oswald, Estelle Massart

The paper analyzes congruence-based neural architectures for classifying positive-definite matrices, demonstrating that common semi-orthogonality constraints severely limit the model's expressivity.

quant-ph · cs.AI · Recent · May 31, 2026

Quantum Algorithm for Distributed Reduction of Entanglements (QADR): A Trainable and Simulation-Efficient QML Framework

Syed Farhan Ahmad, Gregory T. Byrd

The paper introduces QADR, a novel hybrid quantum-classical framework that efficiently trains variational quantum circuits by localizing entanglement reduction, thereby overcoming the exponential memo…

cs.LG · cs.AI · Recent · May 27, 2026

Learning Compositional Latent Structure with Vector Networks

Niclas Pokel, Benjamin F. Grewe

The paper introduces the Vector Network (VN), a novel recurrent architecture that replaces fixed weight matrices with reusable weight atoms, enabling superior compositional generalization by making st…

cs.CR · cs.AI · cs.CV · Recent · Apr 13, 2026

QShield: Securing Neural Networks Against Adversarial Attacks using Quantum Circuits

Navid Azimi, Aditya Prakash, Yao Wang, Li Xiong

The paper proposes QShield, a hybrid quantum-classical neural network architecture, which significantly enhances the adversarial robustness of deep learning models against various attacks.

cs.AI · cs.LG · cs.PL · Recent · May 28, 2026

PassNet: Scaling Large Language Models for Graph Compiler Pass Generation

Yiqun Liu, Yingsheng Wu, Ruqi Yang, Enrong Zheng +10 more

The paper introduces PassNet, a large-scale ecosystem for generating compiler passes using LLMs, demonstrating that LLMs can significantly accelerate graph compilation for long-tail workloads, suggest…

cs.AR · Recent · May 28, 2026

elasticAI.explorer: Towards a Unified End-to-End Framework for Hardware-Aware Neural Architecture Search

Natalie Maman, Florian Hettstedt, Andreas Erbslöh, Gregor Schiele

The elasticAI.explorer is an extensible, unified Python framework that simplifies hardware-aware Neural Architecture Search (NAS) by decoupling search space definition from model implementation and de…

cs.CC · cs.LG · cs.LO · Recent · May 28, 2026

The Complexity of Verifying Feedforward Neural Networks in Quantised Settings

Eric Alsmann, Martin Lange, Marco Sälzer

This paper analyzes the computational complexity of verifying feedforward neural networks when their weights are restricted to finite-width arithmetic, finding that verification remains NP-complete fo…

cs.LG · Recent · Jun 1, 2026

Riemannian Gradient Descent for Low-Rank Architectures

Nicholas Knight

The paper investigates applying Riemannian optimization techniques to low-rank matrix parameters for deep learning, but finds that the proposed methods do not conclusively outperform the AdamW baselin…

cs.CL · cs.AI · Recent · Jun 1, 2026

From Layers to Submodules: Rethinking Granularity in Replacement-Based LLM Compression

Elia Cunegatti, Marcus Vukojevic, Erik Nielsen, Giovanni Iacca

The paper proposes SubFit, a novel compression technique that achieves superior LLM compression by replacing non-contiguous, submodule-level components (Attention and FeedForward) with lightweight res…

cs.LG · cs.AI · cs.DC · Recent · May 27, 2026

How Far Can Disaggregation Go? A Design-Space Exploration of Attention-FFN Disaggregation for Efficient MoE LLM Serving

Hanjiang Wu, Abhimanyu Rajeshkumar Bambhaniya, Sarbartha Banerjee, Tuhin Khare +8 more

The paper systematically analyzes the benefits and limits of Attention-FFN Disaggregation (AFD) for Mixture-of-Experts (MoE) LLM serving, demonstrating that AFD is crucial for achieving high throughpu…

cs.LG · cs.AI · Recent · May 30, 2026

Memory-Efficient LLM Training with Dynamic Sparsity: From Stability to Practical Scaling

Qiao Xiao, Boqian Wu, Patrik Okanovic, Tomasz Sternal +5 more

The paper introduces Sparse Memory-Efficient Training (SMET), a method that stabilizes and optimizes Dynamic Sparse Training (DST) for large language models, enabling stable and memory-efficient spars…

cs.LO · cs.AI · Recent · May 28, 2026

Neural Network Verification using Partial Multi-Neuron Relaxation

Ido Shmuel, Guy Katz

The paper introduces partial multi-neuron relaxation, a novel verification technique that selectively computes tight linear bounds for a small subset of neurons to improve the efficiency and tightness…

cs.LG · cs.AI · math.OC · Recent · May 28, 2026

Singularity-aware Optimization via Randomized Geometric Probing: Towards Stable Non-smooth Optimization

Ruoran Xu, Borong She, Xiaobo Jin, Qiufeng Wang

The paper introduces Singularity-aware Adam (S-Adam), a novel optimizer that stabilizes deep learning training in non-smooth loss landscapes by dynamically damping updates based on local geometric ins…
