20 results for “pitch”
CS papers only. Hybrid search: keyword + semantic, ranked by combined score.
A Wave-U-Net model is trained to extract a fundamental waveform from input speech signals for accurate and robust instantaneous pitch estimation.
The paper proposes methods for generating global prosodic embeddings using auto-encoder models of pitch and energy, demonstrating competitive or superior performance under challenging conditions.
This paper investigates if team-based interaction improves LLM performance on complex reasoning tasks (ChGK), finding that structured team strategies significantly boost accuracy by acting as error-fi…
The paper proposes a trust schema and verification framework to ensure that agent skills, which augment LLMs, are rigorously verified before deployment, thereby making human-in-the-loop oversight scal…
The paper investigates indirect prompt injection vulnerabilities in ReAct agents by systematically varying the injection depth, payload framing, and turn budget, finding that injection depth is the do…
The paper introduces MAVEN, a lightweight symbolic reasoning scaffold that significantly improves the generalization and end-to-end success rate of large language models in complex, multi-step tool-ca…
Tomer Keren, Nitay Calderon, Asaf Yehudai, Yotam Perlitz +2 more
The paper introduces TASTE, an automatic task synthesis method that generates challenging agent benchmarks by evolving tool sequences, demonstrating that existing benchmarks are saturated and that TAS…
Minyang Hu, Bo Yang, Zhinuo Zhou, Jiachen Liang +3 more
The paper introduces RedundancyBench, a new benchmark for detecting unnecessary steps in LLM agent trajectories, finding that this task is highly complex and difficult to solve.
Jing Peng, Junhao Du, Chenghao Wang, Hanqi Li +20 more
The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.
Zhongyu He, Yuanfan Li, Fei Huang, Tianyu Chen +8 more
SIRI introduces a self-internalizing reinforcement learning framework that allows LLM agents to autonomously discover and integrate reusable skills directly into their core policy, significantly impro…
This paper applies the MAP-Elites algorithm to procedurally generate diverse and high-quality First-Person Shooter maps using novel map representations.
The paper proposes SWIM, a novel imitation learning method that can synthesize physically-based swimming motions from a single example, demonstrating superior data efficiency and generalization across…
The paper compares anchorless methods for diversifying LLM-generated idea pools against traditional anchor-dependent methods, finding that semantic direction stratification offers the best balance of…
Tanusree Sharma, Anish Krishnagiri, Lili Dudas, Ahmed Adnan +1 more
The paper introduces V.O.I.C.E, a novel, empirically grounded risk taxonomy that comprehensively models the diverse privacy, security, and governance risks associated with the unconsented synthesis an…
Chih-Heng Chang, Keng-Seng Ho, Chih-Yu Tsai, Kuan-Lin Chen +2 more
AnchorSteer introduces a framework that achieves high-fidelity, structure-preserving music editing by decoupling semantic concept injection from structural constraints.
SkillPager is a novel two-stage framework that efficiently selects minimal, execution-sufficient context from large procedural skill documents by leveraging typed semantic nodes, significantly reducin…
Ioannis Prokopiou, Pantelis Vikatos, Maximos Kaliakatsos-Papakostas, Theodoros Giannakopoulos +1 more
The paper proposes an inference-time activation steering framework, utilizing orthogonalization, to achieve fine-grained, deterministic control over discrete musical attributes like Pitch and Duration…
Hongfei Du, Jiacheng Shi, Sidi Lu, Gang Zhou +1 more
The paper uses sparse autoencoders to identify specific latent features within LLM-based TTS models, enabling interpretable and fine-grained control over emotional expression by intervening in small s…
The paper presents a minimalist BCMI system that translates EEG-measured emotional valence into adaptive music; preliminary testing showed that frontal alpha asymmetry was not reliably modulated by i…