20 results for “Hamming weight binary words”
This paper introduces a novel algorithm for generating binary words of Hamming weight k in linear time while minimizing random bit consumption.
This paper completely characterizes cyclic equalizability for two words over any finite alphabet by proving that the words must share the same Parikh vector.
The paper proposes a robust, multi-stage pipeline combining rule-based classification and machine learning to map noisy retail product names to standardized consumption categories, finding that simple…
Kıvanç Kuzey Dikici, Serdar Kara, Semih Çağlar, Eray Tüzün +1 more
SERSEM introduces a selective entropy-weighted scoring framework to significantly improve Membership Inference Attacks (MIAs) against code LLMs by focusing on human-centric coding anomalies rather tha…
The paper proposes SSG, a novel logit-balanced vocabulary partitioning method, to enhance the watermark strength and detectability of LLM-generated content, especially in low-entropy domains like code…
This paper systematically analyzes combining dimensionality reduction and quantization to compress text embeddings, showing that this combined approach achieves substantial compression (e.g., 0.1% siz…
Clark Hash is a stateless, deterministic quantization method that significantly reduces the storage size of neural embeddings while maintaining high accuracy for cosine similarity search.
XMark introduces a novel multi-bit watermarking technique that reliably embeds binary messages into LLM-generated text while maintaining high text quality and robust performance even with limited toke…
This paper presents methods for ranking and unranking permutations avoiding a pattern of length three in lexicographic or colexicographic order.
The paper introduces the quotient semivalue mechanism to provide fair data attribution that is resistant to contributors manipulating their reported identities by splitting or duplicating data.
The paper proposes SLAT, a segment-level adaptive trimming framework, which efficiently reduces redundant reasoning in large language model CoT outputs by selectively suppressing segments with low mar…
Yuexin Li, Wenjie Qu, Linyu Wu, Yulin Chen +4 more
AliMark proposes a novel framework that enhances the robustness of sentence-level watermarking by reformulating the problem as a bit sequence encoding and alignment task, significantly improving resil…
Yuexin Li, Wenjie Qu, Linyu Wu, Yulin Chen +4 more
AliMark proposes a novel watermarking framework that treats sentence-level watermarking as a bit sequence alignment problem, significantly enhancing robustness against structural text perturbations li…
Minghui Zheng, Hongxu Chen, Huimin Ren, Hongsheng Xin +7 more
HMPO introduces a single-stage, cost-effective reinforcement learning framework that achieves significant token compression of Chain-of-Thought reasoning with minimal loss of accuracy, applicable acro…
The paper quantitatively confirms the Currier A/B language distinction in the Voynich Manuscript, demonstrating it is governed by a higher-dimensional, context-dependent boolean switch rather than a s…
This study systematically evaluates a wide range of chunking methods for Retrieval-Augmented Generation (RAG) to assess their effectiveness and highlight the overlooked challenges associated with chun…
The paper proposes an efficient and provably secure linguistic steganography method using range coding that achieves high embedding capacity and speed, outperforming existing methods.
Elevator is a novel, deterministic binary translator that statically translates entire x86-64 executables to AArch64 by considering all possible interpretations of every byte, eliminating the need for…
The paper proposes CYKNN, a novel recurrent neural network architecture that directly encodes the CYK parsing algorithm, demonstrating superior performance over large language models on syntactic pars…