ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

20 results for “DeepSeek-V3”

CS papers only

Hybrid search: Keyword + semantic, ranked by combined score.ⓘ

Want pure semantic search? Try claim verification →

cs.LGcs.AIcs.DCRecentMay 27, 2026

How Far Can Disaggregation Go? A Design-Space Exploration of Attention-FFN Disaggregation for Efficient MoE LLM Serving

Hanjiang Wu, Abhimanyu Rajeshkumar Bambhaniya, Sarbartha Banerjee, Tuhin Khare +8 more

The paper systematically analyzes the benefits and limits of Attention-FFN Disaggregation (AFD) for Mixture-of-Experts (MoE) LLM serving, demonstrating that AFD is crucial for achieving high throughpu…

View →
cs.CLRecentMay 28, 2026

Probing the Prompt KV Cache: Where It Becomes Dispensable

Vinayshekhar Bannihatti Kumar, Manoj Ghuhan Arivazhagan, Disha Makhija, Rashmi Gangadharaiah

This paper investigates the redundancy of the prompt KV cache during language model decoding, finding that the structure provided by chat templates is the primary source of redundancy, not the actual…

View →
cs.CVcs.AIEmpiricalRecentJun 10, 2026

Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models

Cheng-Yu Yang, Shao-Yuan Lo, Yu-Lun Liu

肖代替了视觉令牌的永久删除,通过可恢复的路由来改进视觉语言模型的性能

View →
cs.CVcs.AIRecentMay 28, 2026

OccamToken: Efficient VLM Inference with Training-Free and Budget-Adaptive Token Pruning

Geng Li, Guohao Chen, Ting Chen, Shilin Shan +5 more

OccamToken introduces a training-free, adaptive token pruning framework that replaces fixed token budgets with relative evidence testing against a register-based reference, significantly improving VLM…

View →
cs.AIRecentMay 31, 2026

TriLens: Per-Layer Logit-Lens Entropy for White-Box Hallucination Detection

Bohan Yang, Yijun Gong, Zhi Zhang, Ge Zhang +2 more

TriLens is a white-box detector that monitors the entropy of three internal streams (attention, feed-forward, residual) at every layer of a language model to detect hallucinations by tracking how inte…

View →
cs.CVcs.AIcs.CRRecentMar 30, 2026

Detection of Adversarial Attacks in Robotic Perception

Ziad Sharawy, Mohammad Nakshbandi, Sorin Mihai Grigorescu

This paper addresses the vulnerability of DNNs used in robotic semantic segmentation to adversarial attacks by proposing specialized detection strategies to enhance safety in robotic perception system…

View →
cs.ROcs.AIcs.CVRecentMay 31, 2026

DeepIPCv3: Event-Aware Multi-Modal Sensor Fusion for Sudden Pedestrian Crossing Avoidance

Oskar Natan, Andi Dharmawan, Aufaclav Zatu Kusuma Frisky, Jazi Eko Istiyanto +1 more

DeepIPCv3 is a novel multi-modal framework that fuses LiDAR and DVS event streams using cross-modal attention to achieve state-of-the-art, highly reactive avoidance maneuvers for sudden pedestrian cro…

View →
cs.LGcs.AIRecentMay 29, 2026

BudgetDraft: Acceptance-Aware Multi-View Training for Sparse-KV Speculative Decoding

Liang He, Jingbo Wen, Qishi Zhan, Yixiong Chen +3 more

BudgetDraft introduces an acceptance-aware multi-view training method that trains a sparse-KV speculative decoder to maintain high acceptance rates across varying context lengths and sparsity levels,…

View →
cs.LGcs.AIRecentMay 27, 2026

Locality-Aware Redundancy Pruning for LLM Depth Compression

Vincent-Daniel Yun, Youngrae Kim, Woosang Lim, YoungJin Heo +2 more

The paper proposes Locality-Aware Redundancy Pruning (LoRP), a training-free method that prunes LLM layers by exploiting localized inter-layer redundancy, leading to improved efficiency while maintain…

View →
cs.AIRecentMay 30, 2026

TAPS: Target-Aware Prefix Tree Selection for Diffusion-Drafted Speculative Decoding

Zhuoyu Wang, Junnan Huang, Xinyu Chen

TAPS introduces a target-aware prefix selection method that optimizes the trade-off between draft tree acceptance and verification cost, achieving significant speedups in speculative decoding.

View →
cs.CVcs.LGeess.IVRecentJun 3, 2026

An Open-Source Two-Stage Computer Vision Pipeline for Fine-Grained Vehicle Classification using Vision Transformers

Gandhimathi Padmanaban, Fred Feng

This paper presents an open-source computer vision pipeline for classifying vehicle body types from naturalistic roadway video.

View →
cs.CRcs.LGcs.SERecentMay 16, 2026

The Range Shrinks, the Threat Remains: Re-evaluating LLM Package Hallucinations on the 2026 Frontier-Model Cohort

Aleksandr Churilov

This study re-evaluates LLM package hallucination rates on a new cohort of frontier models, finding a significant reduction in overall hallucination rates but identifying a persistent, model-agnostic…

View →
cs.CVcs.CRRecentMay 17, 2026

Deepfake Detection in Social Media: A Temporal Artifact Analysis Using 3D Convolutional Neural Networks

Mohammadreza Rashidi, Raja Hashim Ali, Sami Ur Rahman

This paper proposes a 3D CNN detector that leverages temporal artifacts to accurately identify high-quality deepfake videos, demonstrating robust detection even after social media re-encoding.

View →
cs.CVcs.AIRecentMay 28, 2026

Benchmarking Large Vision-Language Models on CFMME: A Comprehensive Chinese Financial Multimodal Evaluation Dataset

Qian Chen, Xianyin Zhang, Yanzhi Liu, Lifan Guo +2 more

This paper introduces CFMME, a comprehensive Chinese financial multimodal benchmark, and evaluates current Large Vision-Language Models (LVLMs), finding that while state-of-the-art models perform mode…

View →
cs.CLcs.AIRecentMay 27, 2026

PrunePath: Towards Highly Structured Sparse Language Models

Zhexuan Gu, Zixun Fu, Yancheng Yuan

PrunePath introduces a budget-adaptive structured sparsification framework that efficiently prunes Feed-forward networks in large language models, achieving hardware-friendly sparsity and measurable s…

View →
cs.CRcs.LGRecentApr 7, 2026

AttnDiff: Attention-based Differential Fingerprinting for Large Language Models

Haobo Zhang, Zhenhua Xu, Junxian Li, Shangfeng Sheng +2 more

AttnDiff introduces a data-efficient white-box framework that extracts intrinsic attention-based fingerprints to verify the provenance and detect unauthorized derivation of large language models (LLMs…

View →
cs.CVcs.AIcs.LGRecentJun 1, 2026

A Structured Benchmark for Text-Guided Anomaly Detection: When Language Stops Conditioning the Decision

Stefano Samele, Eugenio Lomurno, Teodora Jovanovic, Sanjay Shivakumar Manohar +2 more

The paper introduces a structured benchmark (TGAD) showing that current text-guided anomaly detection models often overstate their language conditioning, as performance significantly degrades when the…

View →
cs.AIRecentMay 27, 2026

Clark Hash: Stateless Sparse Johnson-Lindenstrauss Quantization for Neural Embeddings

Stanislav Kirdey, Clark Labs Inc

Clark Hash is a stateless, deterministic quantization method that significantly reduces the storage size of neural embeddings while maintaining high accuracy for cosine similarity search.

View →
cs.ARcs.AIcs.DCRecentMay 28, 2026

Memory-Bound but Not Bandwidth-Limited: The Physical AI Inference Gap in Batch-1 LLM Decode

Josef Chen

Physical AI inference (batch-1 decode) is primarily memory-bandwidth-bound, but the observed latency gap between fast and slow GPUs is not solely due to memory bandwidth, as launch-side overheads beco…

View →
cs.CRcs.AIRecentJun 3, 2026

Search-Time Contamination in Deep Research Agents: Measuring Performance Inflation in Public Benchmark Evaluation

Yongjie Wang, Xinyue Zhang, Kunhong Yao, Zhiwei Zeng +3 more

The paper introduces the concept of Search-Time Contamination (STC), demonstrating that deep research agents can leak information from public benchmarks via web search, leading to an overestimation of…

View →