ArXivCSExplorer
Built by Teycir Ben Soltane

20 results for “pitch”

CS papers only

Hybrid search: keyword + semantic, ranked by combined score.
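The page does not say how the keyword and semantic scores are combined. As a minimal sketch of one common approach, assuming a weighted sum of a term-overlap score and a cosine similarity, the ranking could look like the following. The function names, the `alpha` weight, and the toy embeddings are all hypothetical and are not taken from the site.

```python
from math import sqrt


def keyword_score(query, text):
    """Fraction of query terms found in the text (a toy keyword match)."""
    terms = query.lower().split()
    words = set(text.lower().split())
    if not terms:
        return 0.0
    return sum(term in words for term in terms) / len(terms)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Rank docs by alpha * keyword score + (1 - alpha) * semantic score.

    docs is a list of (title, text, embedding) tuples. The weighted sum
    is an assumption; the site's real scoring rule is not documented here.
    """
    scored = []
    for title, text, vec in docs:
        combined = (alpha * keyword_score(query, text)
                    + (1 - alpha) * cosine(query_vec, vec))
        scored.append((combined, title))
    # Highest combined score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]


# Toy data: two-dimensional embeddings invented for illustration only.
docs = [
    ("C", "agent benchmarks", [0.0, 1.0]),
    ("B", "speech prosody", [0.6, 0.8]),
    ("A", "pitch estimation", [1.0, 0.0]),
]
ranking = hybrid_rank("pitch", [1.0, 0.0], docs)
print(ranking)
```

In this sketch a result with no keyword match can still rank above others if its embedding is close to the query, which is one plausible reason a query like "pitch" returns papers whose summaries never mention the word.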

cs.SD • Empirical • Recent • Jun 12, 2026

Instantaneous Pitch Estimation via Wave-U-Net-Based Fundamental Waveform Enhancement

Junya Koguchi, Tomoki Koriyama

A Wave-U-Net model is trained to extract a fundamental waveform from input speech signals for accurate and robust instantaneous pitch estimation.

eess.AS • Empirical • Recent • Jun 12, 2026

Unsupervised Approaches for Global Prosodic Embedding Extraction

Martin Meza, Luciana Ferrer, Pablo Riera

The paper proposes methods for generating global prosodic embeddings using auto-encoder models of pitch and energy, demonstrating competitive or superior performance under challenging conditions.

cs.CL • Recent • May 28, 2026

Can LLM Teams Play What? Where? When?

Anastasia Kotelnikova, Viktor Byzov, Maria Dolzhenkova, Evgeny Kotelnikov

This paper investigates if team-based interaction improves LLM performance on complex reasoning tasks (ChGK), finding that structured team strategies significantly boost accuracy by acting as error-fi…

cs.CR • cs.AI • cs.MA • Recent • May 1, 2026

Skills as Verifiable Artifacts: A Trust Schema and a Biconditional Correctness Criterion for Human-in-the-Loop Agent Runtimes

Alfredo Metere

The paper proposes a trust schema and verification framework to ensure that agent skills, which augment LLMs, are rigorously verified before deployment, thereby making human-in-the-loop oversight scal…

cs.CR • cs.AI • cs.LG • Recent • May 29, 2026

Depth-Dependent Indirect Prompt Injection in Tool-Calling ReAct Agents: Injection Depth, Payload Framing, and Turn-Budget Sensitivity

Mohammadreza Rashidi

The paper investigates indirect prompt injection vulnerabilities in ReAct agents by systematically varying the injection depth, payload framing, and turn budget, finding that injection depth is the do…

cs.AI • Recent • May 29, 2026

MAVEN: Improving Generalization in Agentic Tool Calling

Omkar Ghugarkar, Vishvesh Bhat, Muhammad Ahmed Mohsin, Asad Aali

The paper introduces MAVEN, a lightweight symbolic reasoning scaffold that significantly improves the generalization and end-to-end success rate of large language models in complex, multi-step tool-ca…

cs.AI • Recent • May 27, 2026

A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks

Tomer Keren, Nitay Calderon, Asaf Yehudai, Yotam Perlitz +2 more

The paper introduces TASTE, an automatic task synthesis method that generates challenging agent benchmarks by evolving tool sequences, demonstrating that existing benchmarks are saturated and that TAS…

cs.AI • Recent • May 28, 2026

Redundant or Necessary? A Benchmark for Detecting Redundant Steps in Agent Trajectories

Minyang Hu, Bo Yang, Zhinuo Zhou, Jiachen Liang +3 more

The paper introduces RedundancyBench, a new benchmark for detecting unnecessary steps in LLM agent trajectories, finding that this task is highly complex and difficult to solve.

eess.AS • cs.AI • cs.SD • Recent • May 29, 2026

A Unified and Reproducible Experimentation Framework for Speech Understanding

Jing Peng, Junhao Du, Chenghao Wang, Hanqi Li +20 more

The paper introduces SURE, a unified framework designed to standardize and improve the comparability and reproducibility of evaluations for advanced speech understanding models.

cs.AI • cs.LG • Recent • Jun 1, 2026

SIRI: Self-Internalizing Reinforcement Learning with Intrinsic Skills for LLM Agent Training

Zhongyu He, Yuanfan Li, Fei Huang, Tianyu Chen +8 more

SIRI introduces a self-internalizing reinforcement learning framework that allows LLM agents to autonomously discover and integrate reusable skills directly into their core policy, significantly impro…

cs.AI • Recent • May 28, 2026

Procedural Generation of First Person Shooter Maps using Map-Elites

Simone de Donato, Pier Luca Lanzi, Daniele Loiacono

This paper applies the MAP-Elites algorithm to procedurally generate diverse and high-quality First-Person Shooter maps using novel map representations.

cs.GR • cs.AI • cs.LG • Recent • May 29, 2026

SWIM: Single-Instance Whole-Body Imitation for swiMming

Binglun Wang, Edmond S. L. Ho, He Wang

The paper proposes SWIM, a novel imitation learning method that can synthesize physically-based swimming motions from a single example, demonstrating superior data efficiency and generalization across…

cs.AI • Recent • May 28, 2026

Anchorless Diversification for Parallel LLM Ideation

Fares Nabil Ibrahim, Nafis Saami Azad, Raiyan Abdul Baten

The paper compares anchorless methods for diversifying LLM-generated idea pools against traditional anchor-dependent methods, finding that semantic direction stratification offers the best balance of…

cs.CR • cs.AI • cs.CY • Recent • Apr 25, 2026

V.O.I.C.E (Voice, Ownership, Identity, Control, Expression): Risk Taxonomy of Synthetic Voice Generation From Empirical Data

Tanusree Sharma, Anish Krishnagiri, Lili Dudas, Ahmed Adnan +1 more

The paper introduces V.O.I.C.E, a novel, empirically grounded risk taxonomy that comprehensively models the diverse privacy, security, and governance risks associated with the unconsented synthesis an…

cs.SD • cs.AI • Recent • May 29, 2026

AnchorSteer: Self-Discovered Concept Injection for Structure-Preserving Music Editing

Chih-Heng Chang, Keng-Seng Ho, Chih-Yu Tsai, Kuan-Lin Chen +2 more

AnchorSteer introduces a framework that achieves high-fidelity, structure-preserving music editing by decoupling semantic concept injection from structural constraints.

cs.IR • cs.AI • Recent • May 30, 2026

SkillPager: Query-Adaptive Intra-Skill Navigation via Semantic Node Retrieval

Zicai Cui, Zihan Guo, Weiwen Liu, Weinan Zhang

SkillPager is a novel two-stage framework that efficiently selects minimal, execution-sufficient context from large procedural skill documents by leveraging typed semantic nodes, significantly reducin…

cs.SD • cs.AI • cs.IR • Recent • May 29, 2026

Latent Space Disentanglement via Activation Steering for Interpretable Attribute Control in Symbolic Music Generation

Ioannis Prokopiou, Pantelis Vikatos, Maximos Kaliakatsos-Papakostas, Theodoros Giannakopoulos +1 more

The paper proposes an inference-time activation steering framework, utilizing orthogonalization, to achieve fine-grained, deterministic control over discrete musical attributes like Pitch and Duration…

cs.CL • Recent • May 31, 2026

Sparse Autoencoders for Interpretable Emotion Control in Text-to-Speech

Hongfei Du, Jiacheng Shi, Sidi Lu, Gang Zhou +1 more

The paper uses sparse autoencoders to identify specific latent features within LLM-based TTS models, enabling interpretable and fine-grained control over emotional expression by intervening in small s…

cs.AI • cs.HC • Recent • May 31, 2026

A Minimalist Brain-Computer Musical Interface for Real-Time Emotion-Driven Sonification: System Design and Preliminary Evaluation

Pablo A. Monroy-D'Croz, Rafael Ramirez-Melendez, Julian Cespedes-Guevara

The paper designed a minimalist BCMI system to translate EEG-measured emotional valence into adaptive music, but preliminary testing showed that frontal alpha asymmetry was not reliably modulated by i…