ArXivCSExplorer
Built by Teycir Ben Soltane

20 results for “Token-efficient”

CS papers only

Hybrid search: Keyword + semantic, ranked by combined score.


cs.AI • cs.CL • Recent • May 28, 2026

Notation Matters: A Benchmark Study of Token-Optimized Formats in Agentic AI Systems

Lorenz Kutschka, Bernhard Geiger

This study benchmarks token-optimized formats (TOON and TRON) against JSON in end-to-end agentic AI systems, finding that TRON significantly reduces token overhead with minimal performance degradation…

cs.CL • cs.DS • Recent • May 29, 2026

Incremental BPE Tokenization

Shenghu Jiang, Ruihao Gong

The paper introduces an efficient, novel algorithm for incremental Byte Pair Encoding (BPE) tokenization that processes input text prefix by prefix, achieving significant speedups and enabling streami…

cs.LG • cs.CL • Recent • May 29, 2026

Trading Complexity for Expressivity Through Structured Generalized Linear Token Mixing

Erwan Fagnou, Paul Caillon, Blaise Delattre, Alexandre Allauzen

The paper proposes a unified framework for designing efficient and expressive token mixing layers by separating the direct and recurrent influences of inputs, allowing for a principled trade-off betwe…

cs.AI • Recent • May 28, 2026

Accelerating Constrained Decoding with Token Space Compression

Michael Sullivan, Alexander Koller

The paper introduces CFGzip, an offline token space compression technique that significantly reduces the computational overhead of constrained decoding, making complex grammar enforcement feasible at…

cs.LO • cs.AI • Recent • May 27, 2026

Token Optimization Strategies for LLM-Based Oracle-to-PostgreSQL Migration

Oleg Grynets, Dmytro Babarytskyi, Vasyl Lyashkevych

This paper formalizes token optimization as a multi-objective constrained transformation problem for LLM-based Oracle-to-PostgreSQL migration, demonstrating that adaptive routing offers the best balan…

cs.CV • cs.AI • Recent • May 28, 2026

OccamToken: Efficient VLM Inference with Training-Free and Budget-Adaptive Token Pruning

Geng Li, Guohao Chen, Ting Chen, Shilin Shan +5 more

OccamToken introduces a training-free, adaptive token pruning framework that replaces fixed token budgets with relative evidence testing against a register-based reference, significantly improving VLM…

cs.CL • cs.AI • Recent • Jun 1, 2026

AlphaToken: Decoupling Adaptation and Stability for Path-Aware Response Token Valuation in LLM Post-Training

Liu Qing, Ou Wu, Yi Du

AlphaToken is a novel response token valuation framework that improves LLM post-training by decoupling token selection into task-specific adaptation and stability preservation, leading to better perfo…

cs.CR • Recent • Jun 1, 2026

The Unicity Execution Layer

Ahto Buldas, Dirk Draheim, Mike Gault, Risto Laanoja +2 more

The paper introduces the Unicity Execution Layer, a secure, modular component that enables trustless off-chain transactions while guaranteeing double-spending prevention and enhancing user privacy.

cs.CR • cs.AI • cs.CC • Recent • Jun 3, 2026

Token Rankings are Unforgeable Language Model Signatures

Matthew Finlayson, Andreas Grivas, Xiang Ren, Swabha Swayamdipta

The paper demonstrates that token rankings provide a unique, unforgeable signature for language models, and proposes an API restriction that allows for signature presentation without leaking model par…

cs.CR • cs.AI • Recent • Apr 20, 2026

Understanding Secret Leakage Risks in Code LLMs: A Tokenization Perspective

Meifang Chen, Zhe Yang, Huang Nianchen, Yizhan Huang +3 more

This paper investigates how Byte-Pair Encoding (BPE) tokenization causes Code LLMs to disproportionately memorize certain types of secrets, a phenomenon termed 'gibberish bias'.

cs.CR • cs.AI • cs.CL • Recent • May 28, 2026

Token Inflation: How Dishonest Providers Can Overcharge for Large Language Model Usage

Shahinul Hoque, Jinghuai Zhang, Jinyuan Sun, Fnu Suya

The paper demonstrates that the current per-token billing model for LLMs is susceptible to systematic overcharging because auditing frameworks must rely on evidence provided by the very companies that…

cs.CR • cs.AI • cs.LG • Recent • Mar 24, 2026

Not All Tokens Are Created Equal: Query-Efficient Jailbreak Fuzzing for LLMs

Wenyu Chen, Xiangtao Meng, Chuanchao Zang, Li Wang +5 more

The paper proposes TriageFuzz, a token-aware fuzzing framework that significantly reduces the number of queries needed to jailbreak LLMs while maintaining high attack success rates.

cs.CR • cs.AI • Recent • Mar 30, 2026

Privacy Guard & Token Parsimony by Prompt and Context Handling and LLM Routing

Alessio Langiu

The paper introduces a 'Privacy Guard' framework that simultaneously reduces operational costs and eliminates data leakage risks when using LLMs by optimizing prompts and routing queries to secure mod…

cs.CR • Recent • Jun 1, 2026

Unicity: Predicates and Atomic Swaps

Ahto Buldas, Dirk Draheim, Mike Gault, Risto Laanoja +2 more

The paper generalizes Unicity token ownership using programmable spending conditions called predicates, enabling trustless atomic swaps and smart-contract-like functionality executed off-chain.

cs.CV • cs.AI • Empirical • Recent • Jun 10, 2026

Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models

Cheng-Yu Yang, Shao-Yuan Lo, Yu-Lun Liu

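The paper replaces the permanent removal of visual tokens with recoverable routing to improve the performance of vision-language models.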

cs.CL • cs.CR • Recent • Jun 2, 2026

Selective Token-Level Cryptographic Redaction for Privacy-Preserving Clinical Deployment of Large Language Models

Farhan Sheth, Ziyuan Yang, Yongying Lan, Si Yong Yeo

The paper introduces HERALD, a token-level cryptographic redaction framework that encrypts only sensitive tokens in clinical text, enabling privacy-preserving LLM deployment without significant loss o…

cs.SE • cs.CR • Recent • Apr 1, 2026

SERSEM: Selective Entropy-Weighted Scoring for Membership Inference in Code Language Models

Kıvanç Kuzey Dikici, Serdar Kara, Semih Çağlar, Eray Tüzün +1 more

SERSEM introduces a selective entropy-weighted scoring framework to significantly improve Membership Inference Attacks (MIAs) against code LLMs by focusing on human-centric coding anomalies rather tha…

cs.CR • cs.AI • cs.CL • Recent • May 6, 2026

Sparse Tokens Suffice: Jailbreaking Audio Language Models via Token-Aware Gradient Optimization

Zheng Fang, Xiaosen Wang, Shenyi Zhang, Shaokang Wang +1 more

The paper introduces Token-Aware Gradient Optimization (TAGO), demonstrating that sparse optimization focusing only on high-gradient audio tokens is sufficient for effective jailbreaking of audio lang…

cs.CV • cs.AI • cs.CL • Recent • May 31, 2026

On the Limits of Token Reduction for Efficient Unified Vision Language Training

Siyi Chen, Weiming Zhuang, Jingtao Li, Lingjuan Lv

The paper analyzes token reduction for efficient unified VLM training, finding that while task-specific acceleration saves computation, it destroys the mutual performance gains achieved through joint…
