"Sequence analysis" | ArxivCSExplorer

20 results for “Sequence analysis”

CS papers only

Hybrid search: keyword + semantic, ranked by combined score.

cs.CR | Recent | May 18, 2026

Structural Analysis of Cryptographic Sequences using Stringology-Based Fingerprinting

The paper introduces a stringology-based fingerprinting (SBF) framework to structurally analyze cryptographic sequences, demonstrating that pattern analysis can reveal measurable structural signatures…

cs.CC, q-bio.QM | Recent | Jun 1, 2026

Structure-Informed Multiple Sequence Alignment: A Formal Model and Hardness Results

Yoshiki Kanazawa, Naphan Benchasattabuse, Michal Hajdušek, Rodney Van Meter

The paper formally models structure-informed multiple sequence alignment (MSA-S) as an NP-complete optimization problem, establishing a strong computational complexity baseline for the field.

cs.SE, cs.CR | Recent | Apr 1, 2026

SERSEM: Selective Entropy-Weighted Scoring for Membership Inference in Code Language Models

Kıvanç Kuzey Dikici, Serdar Kara, Semih Çağlar, Eray Tüzün +1 more

SERSEM introduces a selective entropy-weighted scoring framework to significantly improve Membership Inference Attacks (MIAs) against code LLMs by focusing on human-centric coding anomalies rather tha…

cs.CR, cs.CL | Recent | Apr 20, 2026

Beyond Pattern Matching: Seven Cross-Domain Techniques for Prompt Injection Detection

Thamilvendhan Munirathinam

This paper introduces seven novel, cross-domain techniques for detecting prompt injection attacks, moving beyond the limitations of traditional regex and transformer classifiers.

quant-ph, cs.CR | Recent | May 29, 2026

Software Platform for Hybrid Pseudo-Random Sequence Generation and Predictability Analysis Based on LFSR and Mersenne Twister

Ali Abdolrahimi Zarnagh, Ali Motazedifard

The paper introduces a software platform for generating and analyzing pseudo-random sequences (like LFSR and Mersenne Twister), demonstrating that while these classical generators are efficient, quant…
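As a rough illustration of the kind of generator this entry names, the sketch below steps a 16-bit Fibonacci LFSR in Python. The tap positions (16, 14, 13, 11), the seed `0xACE1`, and all function names are assumptions chosen for illustration; they are not taken from the paper's platform.

```python
def lfsr16_step(state: int) -> int:
    """Advance a 16-bit Fibonacci LFSR by one step (assumed taps: 16, 14, 13, 11)."""
    # XOR the tapped bits to form the feedback bit.
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    # Shift right and feed the new bit in at the top.
    return (state >> 1) | (bit << 15)


def lfsr16_bits(seed: int, n: int) -> list[int]:
    """Return the first n output bits (the low bit of each successive state)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero; the all-zero state never changes")
    out = []
    for _ in range(n):
        out.append(state & 1)
        state = lfsr16_step(state)
    return out


def lfsr16_period(seed: int) -> int:
    """Count the steps until the register returns to its starting state."""
    state = lfsr16_step(seed)
    steps = 1
    while state != seed:
        state = lfsr16_step(state)
        steps += 1
    return steps
```

Because each new bit is a fixed XOR of earlier state bits, the whole output stream is determined by the seed and the taps.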

cs.CR | Recent | Apr 17, 2026

Stringology Based Cryptology

Victor Kebande

This paper proposes Stringology-Based Cryptology (SBC), a novel approach that analyzes the structural properties of cryptographic outputs by treating them as symbolic sequences, offering complementary…

cs.IR, cs.CL | Dataset | Recent | Jun 9, 2026

A PubMed-Scale Dataset of Structured Biomedical Abstracts

Chia-Hsuan Chang, Haerin Song, Brian Ondov, Hua Xu

The authors introduce Structured PubMed, a comprehensive corpus of section-labeled biomedical abstracts compiled from the complete PubMed database.

cs.CC, cs.LG | Theoretical | Recent | Jun 11, 2026

The Program Is Still There: A Conservation Law for Program Discovery

Jorge Miguel Silva

This paper measures the lower bound for the shortest program generating a sequence, proving a conservation law and providing a deterministic engine to recover generating programs for certain sequences…

cs.DS, cs.DM | Theoretical | Recent | Jun 11, 2026

(Un)ranking Permutation Classes

Nathanaël Hassler, Vincent Vajnovszki

This paper presents methods for ranking and unranking permutations avoiding a pattern of length three in lexicographic or colexicographic order.
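For readers unfamiliar with the terms, ranking maps a permutation to its position in a fixed order, and unranking inverts that map. The Python sketch below does this for all permutations of 0..n-1 in lexicographic order, using the standard factorial number system. It does not handle pattern avoidance and is not the paper's method; the function names are illustrative.

```python
from math import factorial


def rank(perm: list[int]) -> int:
    """0-based lexicographic rank of a permutation of 0..n-1."""
    n = len(perm)
    remaining = sorted(perm)
    r = 0
    for i, value in enumerate(perm):
        # idx = number of unused values smaller than the current one.
        idx = remaining.index(value)
        r += idx * factorial(n - 1 - i)
        remaining.pop(idx)
    return r


def unrank(r: int, n: int) -> list[int]:
    """Permutation of 0..n-1 with 0-based lexicographic rank r."""
    if not 0 <= r < factorial(n):
        raise ValueError("rank out of range")
    remaining = list(range(n))
    perm = []
    for i in range(n):
        # The leading digit in the factorial number system picks the next value.
        idx, r = divmod(r, factorial(n - 1 - i))
        perm.append(remaining.pop(idx))
    return perm
```

Under these definitions, `unrank(rank(p), len(p))` returns `p` for any permutation `p` of 0..n-1.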

q-bio.BM, cs.AI | Recent | May 29, 2026

AMix-2: Establishing Protein as a Native Modality in Large Language Models

Keyue Qiu, Yixin Wu, Lihao Wang, Yawen Ouyang +18 more

The paper introduces AMix-2, a novel protein-text foundation model that unifies protein understanding and sequence design by embedding both modalities in a shared token space, achieving state-of-the-a…

cs.CL | Recent | May 29, 2026

Semantic Triplet Restoration: A Novel Protocol for Hierarchical Table Understanding in Large Language Models

Yibin Zhao, Fangxin Shang, Dingrui Yang, Yuqi Wang

The paper introduces Semantic Triplet Restoration (STR), a novel protocol that converts complex table structures into atomic semantic triplets, improving table question answering by providing explicit…

cs.CL, cs.AI, cs.DS | Recent | May 29, 2026

Neuro-symbolic Syntactic Parsing: Shaping a Neural Network with the CYK Algorithm

Fabio Massimo Zanzotto, Federico Ranaldi, Giorgio Satta

The paper proposes CYKNN, a novel recurrent neural network architecture that directly encodes the CYK parsing algorithm, demonstrating superior performance over large language models on syntactic pars…
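For context, the classical CYK algorithm that this entry builds on can be sketched in Python as a plain dynamic-programming recognizer. This is the textbook algorithm for a grammar in Chomsky normal form, not the CYKNN architecture; the rule encoding (`unary`, `binary`) and the function name are assumptions made for illustration.

```python
def cyk_recognize(words, unary, binary, start="S"):
    """Return True if `words` is derivable from `start`.

    unary:  dict mapping a terminal to the set of nonterminals A with A -> terminal
    binary: dict mapping a pair (B, C) to the set of nonterminals A with A -> B C
    """
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(unary.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point between left and right parts
                for b in table[i][k]:
                    for c in table[k + 1][j]:
                        table[i][j] |= binary.get((b, c), set())
    return start in table[0][n - 1]


# Toy grammar (hypothetical): S -> NP VP, VP -> V NP
unary = {"she": {"NP"}, "eats": {"V"}, "fish": {"NP"}}
binary = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}
accepted = cyk_recognize(["she", "eats", "fish"], unary, binary)
rejected = cyk_recognize(["eats", "she", "fish"], unary, binary)
```

With this toy grammar, `accepted` is `True` and `rejected` is `False`.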

cs.AI, cs.DB | Recent | May 27, 2026

A Query Engine for the Agents

Kenny Daniel

The paper introduces Hyperparam, a set of lightweight JavaScript libraries designed to enable direct, model-aware querying of unstructured data (like agent traces) within client-side AI applications.

cs.CR, cs.CL | Recent | Apr 28, 2026

The Surprising Universality of LLM Outputs: A Real-Time Verification Primitive

Alex Bogdan, Adrian de Valois-Franklin

The paper identifies a universal, statistically predictable distribution (Mandelbrot) governing LLM outputs, enabling a highly efficient, model-agnostic scoring primitive for provenance and quality as…

cs.LG | Recent | Jun 3, 2026

BBOmix: A Tabular Benchmark for Hyperparameter Optimization of Unsupervised Biological Representation Learning

Luca Thale-Bombien, Jan Ewald, Ralf König, Aaron Klein

This paper introduces BBOmix, an open-source benchmark for unsupervised representation learning on real-world biological data.

cs.CR, cs.LG | Recent | Apr 26, 2026

SeqShield: A Behavioral Analysis Approach to Uncover Rootkits

Paras Ghodeshwar, Sandeep K Shukla, Anand Handa, Nitesh Kumar

SeqShield proposes a behavior-based rootkit detection system for Windows by analyzing API call sequences using n-gram features, achieving high detection accuracy even against mutated malware variants.
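To make "n-gram features" over API call sequences concrete, here is a minimal Python sketch that counts contiguous n-grams in a call trace. The trace contents, the default `n = 3`, and the function name are hypothetical; the summary does not say which n, which calls, or which feature weighting SeqShield uses.

```python
from collections import Counter


def ngram_counts(calls: list[str], n: int = 3) -> Counter:
    """Count each contiguous run of n API calls in a trace."""
    # A trace shorter than n yields an empty Counter.
    return Counter(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))


# Hypothetical trace, for illustration only.
trace = ["OpenProcess", "ReadProcessMemory", "OpenProcess",
         "ReadProcessMemory", "CloseHandle"]
features = ngram_counts(trace, n=2)
```

Each distinct n-gram becomes one feature dimension, and its count (or a normalized frequency) is the value a downstream classifier would see.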

cs.CR, cs.AI | Recent | May 11, 2026

Benchmarking LLM-Based Static Analysis for Secure Smart Contract Development: Reliability, Limitations, and Potential Hybrid Solutions

Stefan-Claudiu Susan, Andrei Arusoaie, Dorel Lucanu

This paper benchmarks LLMs for smart contract security analysis, concluding that while LLMs show potential, their reliability is limited by lexical bias and requires integration with traditional stati…

cs.CL, cs.DS | Recent | May 29, 2026

Incremental BPE Tokenization

Shenghu Jiang, Ruihao Gong

The paper introduces an efficient, novel algorithm for incremental Byte Pair Encoding (BPE) tokenization that processes input text prefix by prefix, achieving significant speedups and enabling streami…

cs.CR | Recent | May 21, 2026

Parser-Free Querying of Security Logs

Evan Luo, Julien Piet, David Wagner

The paper introduces Sieve, a system that uses a large language model (LLM) to generate executable query code from natural language security questions, significantly improving the ability to perform c…

cs.AI | Recent | May 28, 2026

Accelerating Constrained Decoding with Token Space Compression

Michael Sullivan, Alexander Koller

The paper introduces CFGzip, an offline token space compression technique that significantly reduces the computational overhead of constrained decoding, making complex grammar enforcement feasible at…
