20 results for “Inference-time augmentation”
CS papers only. Hybrid search: keyword + semantic, ranked by combined score.
Davood Fattahi, Runze Yan, Saurabh Kataria, Zhaoliang Chen +1 more
This paper proposes a unified framework for inference-time augmentation to improve the robustness of physiological signal classification in real-world deployments.
Wenhao Liu, Hao Shi, Yunhe Li, Weizhi Fei +6 more
This paper proposes a training-free framework called ReasonAlloc to mitigate inference bottlenecks in large language models by recasting decoding-time key-value compression as a hierarchical budget al…
Guancheng Tu, Xiangjun Fu, Suhao Yu, Yao Tang +4 more
This paper proposes NF-CoT, a latent reasoning framework that preserves the advantages of chain-of-thought in large language models.
Yaxuan Kong, Qingren Yao, Yuqi Nie, Yichen Li +6 more
The paper introduces TimeSage-MT, a comprehensive multi-turn benchmark designed to rigorously test an LLM agent's ability to perform complex, evolving time series analysis, revealing critical gaps in…
The paper proposes an efficient inference procedure for generative planning models by modifying the Open-Closed List (OCL) search, achieving superior performance over existing baselines.
Zhi Zhou, Ming Yang, Shi-Yu Tian, Kun-Yang Yu +2 more
The paper establishes the first theoretical framework for analyzing the learnability of Test-Time Adaptation (TTA) under non-stationary data streams by introducing Recovery Complexity, which quantifie…
The paper proposes a unified framework to evaluate how different types of memory transfer benefit multi-trajectory inference for tool-use LLM agents, finding that the optimal memory method depends cri…
Divergence Decoding (DD) is a novel, effective, and inexpensive method that uses auxiliary models to steer LLM logits during inference, enabling the removal of memorized sensitive data without signifi…
Minkyung Kwon, Jinhyeok Choi, Youngjin Shin, Jaeyeong Kim +2 more
MORPHOS is a novel autoregressive framework that generates dynamic 3D assets (like meshes and radiance fields) from videos by using a unified 4D representation to ensure temporal consistency and handl…
The paper analyzes the failure modes of aggressive 2-bit quantization in large reasoning models, proposing lightweight controls like FP16 planning and loop rescue to restore accuracy and achieve pract…
ChronosAD introduces a novel architecture that uses time series foundation models and a custom Temporal Block to achieve robust and highly accurate anomaly detection across diverse domains.
Yuefeng Peng, Mingzhe Li, Kejing Xia, Renhao Zhang +1 more
This paper presents the first systematic study of membership inference attacks (MIAs) against Vision-Language-Action (VLA) models, demonstrating that these models are highly vulnerable to privacy brea…
Kaiyu Huang, Xingyu Wang, Mingze Kong, Zhubo Shi +5 more
UniScale proposes a unified framework that jointly optimizes model routing and test-time scaling to achieve a superior, fine-grained quality-cost trade-off for large language model inference.
The paper proposes Continuous Reasoning for Vision-Language-Action (VLA) models, arguing that effective reasoning must be a shared, verifiable internal latent space rather than discrete text tokens, l…
Wenwu Li, Yuran Song, Mingze Zhao, Bo Jin +1 more
The paper proposes a novel temporal and structural credit assignment framework to efficiently optimize multi-agent LLM systems by decomposing the error signal and using targeted, discrete gradient upd…
Dong Jing, Jingchen Nie, Tianqi Zhang, Jiaqi Liu +3 more
TempoVLA is a novel Vision-Language-Action model that enables controllable execution speed for robot manipulation by explicitly conditioning the policy on the desired speed.
The paper challenges the conclusion that LLMs lack reasoning by demonstrating that reported performance drops on GSM-Symbolic are often statistically weak and partially attributable to dataset biases,…
Haoming Xu, Weihong Xu, Zongrui Li, Mengru Wang +5 more
The paper introduces Contextual Belief Management (CBM) to address how LLMs should manage accumulating information over long interactions, showing that reinforcement learning significantly improves be…
Yisen Gao, Yixi Cai, Tianshi Zheng, Jiaxin Bai +1 more
HypoAgent is an agentic framework that enables interactive, multi-turn abductive hypothesis generation over knowledge graphs, achieving state-of-the-art performance by integrating specialized agents f…
Jiakang Li, Guanyu Zhu, Can Jin, Chenxi Huang +7 more
The paper introduces Latent Reward Steering (LRS), an adaptive inference-time framework that implicitly improves the reasoning ability of LLMs by guiding the model's internal latent states based on a…