20 results for “Sequence analysis”
CS papers only. Hybrid search: keyword + semantic, ranked by combined score.
The paper introduces a stringology-based fingerprinting (SBF) framework to structurally analyze cryptographic sequences, demonstrating that pattern analysis can reveal measurable structural signatures…
The paper formally models structure-informed multiple sequence alignment (MSA-S) as an NP-complete optimization problem, establishing a strong computational complexity baseline for the field.
Kıvanç Kuzey Dikici, Serdar Kara, Semih Çağlar, Eray Tüzün +1 more
SERSEM introduces a selective entropy-weighted scoring framework to significantly improve Membership Inference Attacks (MIAs) against code LLMs by focusing on human-centric coding anomalies rather tha…
This paper introduces seven novel, cross-domain techniques for detecting prompt injection attacks, moving beyond the limitations of traditional regex and transformer classifiers.
The paper introduces a software platform for generating and analyzing pseudo-random sequences (like LFSR and Mersenne Twister), demonstrating that while these classical generators are efficient, quant…
This paper proposes Stringology-Based Cryptology (SBC), a novel approach that analyzes the structural properties of cryptographic outputs by treating them as symbolic sequences, offering complementary…
The authors introduce Structured PubMed, a comprehensive corpus of section-labeled biomedical abstracts compiled from the complete PubMed database.
This paper measures the lower bound for the shortest program generating a sequence, proving a conservation law and providing a deterministic engine to recover generating programs for certain sequences…
This paper presents methods for ranking and unranking permutations avoiding a pattern of length three in lexicographic or colexicographic order.
Keyue Qiu, Yixin Wu, Lihao Wang, Yawen Ouyang +18 more
The paper introduces AMix-2, a novel protein-text foundation model that unifies protein understanding and sequence design by embedding both modalities in a shared token space, achieving state-of-the-a…
The paper introduces Semantic Triplet Restoration (STR), a novel protocol that converts complex table structures into atomic semantic triplets, improving table question answering by providing explicit…
The paper proposes CYKNN, a novel recurrent neural network architecture that directly encodes the CYK parsing algorithm, demonstrating superior performance over large language models on syntactic pars…
The paper introduces Hyperparam, a set of lightweight JavaScript libraries designed to enable direct, model-aware querying of unstructured data (like agent traces) within client-side AI applications.
The paper identifies a universal, statistically predictable distribution (Mandelbrot) governing LLM outputs, enabling a highly efficient, model-agnostic scoring primitive for provenance and quality as…
This paper introduces BBOmix, an open-source benchmark for unsupervised representation learning on real-world biological data.
SeqShield proposes a behavior-based rootkit detection system for Windows by analyzing API call sequences using n-gram features, achieving high detection accuracy even against mutated malware variants.
This paper benchmarks LLMs for smart contract security analysis, concluding that while LLMs show potential, their reliability is limited by lexical bias and requires integration with traditional stati…
The paper introduces an efficient, novel algorithm for incremental Byte Pair Encoding (BPE) tokenization that processes input text prefix by prefix, achieving significant speedups and enabling streami…
The paper introduces Sieve, a system that uses a large language model (LLM) to generate executable query code from natural language security questions, significantly improving the ability to perform c…
The paper introduces CFGzip, an offline token space compression technique that significantly reduces the computational overhead of constrained decoding, making complex grammar enforcement feasible at…