20 results for “multilayer neural networks”
CS papers only. Hybrid search: keyword + semantic, ranked by combined score.
This paper investigates limitations of learning tanh neural networks under finite-precision computations and Lp accuracy guarantees.
Senmiao Wang, Tiantian Fang, Haoran Zhang, Yushun Zhang +3 more
This paper proposes a preconditioning layer for stable weight conditioning in LLM training.
This paper introduces a mechanistic neuronal network model for multilayer learning, offering biological insights and an alternative to backpropagation.
The paper proposes two novel multi-column RBFN architectures, MC-PSO and MC-APSO, that combine parallel RBFN structures with swarm optimization to significantly outperform existing methods in accuracy…
The paper introduces a novel, non-deep neural network architecture that achieves the performance of LLMs by finding the global optimum of the loss function in a single, closed-form iteration, eliminat…
The paper introduces partial multi-neuron relaxation, a novel verification technique that selectively computes tight linear bounds for a small subset of neurons to improve the efficiency and tightness…
The paper introduces Automatically Differentiable Nonlinear Tensor Networks (ADNTNs) to achieve massive, structured compression of deep neural networks, demonstrating compression ratios up to 77,000x…
The paper analyzes the algorithmic complexity of finding collisions in single-layer binary neural networks, establishing that the collision resistance depends critically on the activation function's t…
The paper analyzes congruence-based neural architectures for classifying positive-definite matrices, demonstrating that common semi-orthogonality constraints severely limit the model's expressivity.
This paper analyzes the computational complexity of verifying feedforward neural networks when their weights are restricted to finite-width arithmetic, finding that verification remains NP-complete fo…
The paper introduces the Vector Network (VN), a novel recurrent architecture that replaces fixed weight matrices with reusable weight atoms, enabling superior compositional generalization by making st…
Liwen Jing, Yisha Lu, Tingting Yang, Li Sun +4 more
The paper introduces SpikeWFM, a novel hybrid architecture combining spiking neural networks (SNNs) and transformers, which significantly improves the robustness and accuracy of wireless foundation mo…
The paper proposes a novel multimodal learning approach to predict the properties of new bilayer 2D materials formed by stacking dissimilar functional layers.
The paper introduces Residualized Sparse Autoencoders (ReSAEs) to improve multi-layer interventions in transformers by training each layer on the residual activation, which better preserves cross-laye…
LayerRoute introduces a lightweight, input-conditioned adapter that selectively skips transformer blocks in agentic language models, achieving significant FLOPs reduction while improving performance.
The paper proposes PG-RSSNN, a physics-guided recurrent state-space neural network that improves multi-step prediction stability and accuracy compared to both pure black-box and pure physics models, e…
The paper proposes a multi-dimensional evaluation framework to assess EEG foundation models under realistic low-resource conditions, finding that while these models excel in long-context tasks, their…
Arnaud Descours, Arnaud Guillin, Geoffrey Lacour, Manon Michel +2 more
This paper develops a novel, computationally efficient method to quantify the uncertainty in wide neural network predictions by characterizing the limiting random fluctuations using stochastic evoluti…
The paper proposes CYKNN, a novel recurrent neural network architecture that directly encodes the CYK parsing algorithm, demonstrating superior performance over large language models on syntactic pars…
The paper analyzes a new class of asynchronous adaptive first-order optimization methods and proves their stochastic convergence rate is O(1/√t) for non-convex functions.