20 results for “Token-efficient”
CS papers onlyHybrid search: Keyword + semantic, ranked by combined score.ⓘ
Want pure semantic search? Try claim verification →
This study benchmarks token-optimized formats (TOON and TRON) against JSON in end-to-end agentic AI systems, finding that TRON significantly reduces token overhead with minimal performance degradation…
The paper introduces an efficient, novel algorithm for incremental Byte Pair Encoding (BPE) tokenization that processes input text prefix by prefix, achieving significant speedups and enabling streami…
The paper proposes a unified framework for designing efficient and expressive token mixing layers by separating the direct and recurrent influences of inputs, allowing for a principled trade-off betwe…
The paper introduces CFGzip, an offline token space compression technique that significantly reduces the computational overhead of constrained decoding, making complex grammar enforcement feasible at…
This paper formalizes token optimization as a multi-objective constrained transformation problem for LLM-based Oracle-to-PostgreSQL migration, demonstrating that adaptive routing offers the best balan…
Geng Li, Guohao Chen, Ting Chen, Shilin Shan +5 more
OccamToken introduces a training-free, adaptive token pruning framework that replaces fixed token budgets with relative evidence testing against a register-based reference, significantly improving VLM…
AlphaToken is a novel response token valuation framework that improves LLM post-training by decoupling token selection into task-specific adaptation and stability preservation, leading to better perfo…
Ahto Buldas, Dirk Draheim, Mike Gault, Risto Laanoja +2 more
The paper introduces the Unicity Execution Layer, a secure, modular component that enables trustless off-chain transactions while guaranteeing double-spending prevention and enhancing user privacy.
The paper demonstrates that token rankings provide a unique, unforgeable signature for language models, and proposes an API restriction that allows for signature presentation without leaking model par…
Meifang Chen, Zhe Yang, Huang Nianchen, Yizhan Huang +3 more
This paper investigates how Byte-Pair Encoding (BPE) tokenization causes Code LLMs to disproportionately memorize certain types of secrets, a phenomenon termed 'gibberish bias'.
The paper demonstrates that the current per-token billing model for LLMs is susceptible to systematic overcharging because auditing frameworks must rely on evidence provided by the very companies that…
The paper demonstrates that the current per-token billing model for LLMs is susceptible to systematic inflation because auditing frameworks must rely on evidence provided by the service provider, crea…
Wenyu Chen, Xiangtao Meng, Chuanchao Zang, Li Wang +5 more
The paper proposes TriageFuzz, a token-aware fuzzing framework that significantly reduces the number of queries needed to jailbreak LLMs while maintaining high attack success rates.
The paper introduces a 'Privacy Guard' framework that simultaneously reduces operational costs and eliminates data leakage risks when using LLMs by optimizing prompts and routing queries to secure mod…
Ahto Buldas, Dirk Draheim, Mike Gault, Risto Laanoja +2 more
The paper generalizes Unicity token ownership using programmable spending conditions called predicates, enabling trustless atomic swaps and smart-contract-like functionality executed off-chain.
肖代替了视觉令牌的永久删除,通过可恢复的路由来改进视觉语言模型的性能
The paper introduces HERALD, a token-level cryptographic redaction framework that encrypts only sensitive tokens in clinical text, enabling privacy-preserving LLM deployment without significant loss o…
Kıvanç Kuzey Dikici, Serdar Kara, Semih Çağlar, Eray Tüzün +1 more
SERSEM introduces a selective entropy-weighted scoring framework to significantly improve Membership Inference Attacks (MIAs) against code LLMs by focusing on human-centric coding anomalies rather tha…
Zheng Fang, Xiaosen Wang, Shenyi Zhang, Shaokang Wang +1 more
The paper introduces Token-Aware Gradient Optimization (TAGO), demonstrating that sparse optimization focusing only on high-gradient audio tokens is sufficient for effective jailbreaking of audio lang…
The paper analyzes token reduction for efficient unified VLM training, finding that while task-specific acceleration saves computation, it destroys the mutual performance gains achieved through joint…