20 results for “Mamba SSM”
CS papers only. Hybrid search: keyword + semantic, ranked by combined score.
The paper proposes SISA (SSM-Informed Softmax Attention), a novel hybrid attention mechanism that integrates state-space model (SSM) importance signals directly into the attention score, achieving sta…
Zamba2-VL is a new suite of vision-language models built on the Zamba2 hybrid architecture, achieving state-of-the-art performance and significantly improved inference efficiency compared to leading T…
This study systematically evaluates Vision Mamba models for detecting AI-generated images, finding that while they show promise, their current strengths and limitations must be understood relative to…
Deyu Zhuang, Peiliang Gong, Yang Shao, Liyuan Shu +3 more
The paper proposes PC-MambaSDE, a physically-constrained continuous-time framework that accurately predicts Remaining Useful Life (RUL) despite irregular sensor observations and ensures physically pla…
Pingping Liu, Aohua Li, Yubing Lu, Jin Kuang +2 more
The paper proposes RPCASSM, a novel state space model leveraging Robust PCA (RPCA) to accurately detect and segment infrared small targets by separately modeling background and target information base…
The paper demonstrates that in Mamba-2, single-bucket probes can detect a large functional signature (detection layer) that is not fully responsible for the actual computation (execution layer), chall…
This study benchmarks four local LLMs for natural-language-to-SQL querying in biopharma manufacturing, finding that general-purpose code-tuned models like Llama 3.1 8B and Qwen 2.5 Coder 7B outperform…
EnergyMamba proposes an uncertainty-aware, graph-enhanced selective state space model to significantly improve both the accuracy and reliability of energy consumption prediction by explicitly modeling…
This systematic review analyzes the current state of SMS phishing (smishing) attacks and defenses, organizing existing research into four pillars to identify gaps and propose future mitigation strateg…
MyoSem introduces an EMG-action semantic alignment framework that transforms low-level muscle signals into a shared semantic space, enabling bidirectional retrieval between EMG data and natural langua…
DeltaMCP is a specification-aware, incremental regeneration tool that efficiently updates Model Context Protocol (MCP) servers by only modifying affected tooling when a service's OpenAPI specification…
The paper develops a general framework for dynamic consistent submodular maximization, achieving constant-factor approximations with sublinear consistency for both cardinality and rank-$k$ matroid con…
The paper introduces a novel, non-deep neural network architecture that achieves the performance of LLMs by finding the global optimum of the loss function in a single, closed-form iteration, eliminat…
The paper proposes a deterministic, version-aware aggregation method that significantly outperforms existing LLM-based systems for resolving memory conflicts in fact consolidation tasks.
MambaNetBurst introduces a compact, tokenizer-free byte-level classifier using a Mamba-2 backbone to achieve strong network traffic classification without requiring pre-training or complex data prepro…
The study demonstrates that LLMs exhibit significant, language-driven disparities in medical triage recommendations, recommending emergency care more frequently for English and Arabic prompts, even wh…
The paper formally models structure-informed multiple sequence alignment (MSA-S) as an NP-complete optimization problem, establishing a strong computational complexity baseline for the field.
This paper formalizes token optimization as a multi-objective constrained transformation problem for LLM-based Oracle-to-PostgreSQL migration, demonstrating that adaptive routing offers the best balan…
Yalun Dai, Yangyu Huang, Tongshen Yang, Yonghan Wang +7 more
This paper proposes four guidelines and two novel data ordering methods (STR and SAW) to systematically optimize data organization, significantly enhancing the stability and performance of LLM trainin…
Leo Linqian Gan, Jeffery Wu, Longyuan Ge, Lanqing Yang +5 more
ClawGuard introduces a passive, out-of-band security monitor that detects LLM agent workflow hijacking by analyzing unique electromagnetic (EM) emanations generated during agent skill execution.