~ similar to 2605.29931· 16 results
The paper introduces HAIM, a new benchmark dataset designed to move AI music detection beyond simple binary classification by tracking specific stages and types of AI integration in music production.
This study investigates how AI's integration into various workplaces affects employees' job satisfaction by examining changes in perceived job decency and meaningfulness, finding that the impact varie…
The paper argues that current 'on-the-fly' AI agent design lacks necessary software engineering rigor and proposes an 'AI Workflow Store' to provide hardened, reusable, and reliable agent workflows.
The paper introduces AgenticVBench, a comprehensive benchmark of 100 real-world video post-production tasks, and finds that even the best AI agents perform significantly worse than human experts on th…
The study found that while AI collaboration is promising, highly competent and proactive AI systems can negatively impact human perceptions of ownership and job meaningfulness, suggesting that design…
This study surveyed higher education practitioners to map their beliefs and behaviors regarding AI integration, finding that while they view AI favorably, institutional barriers and gaps in design-ori…
Aakash Pant, Kavya Shah, Apoorv Agnihotri, Sneha Nikam +2 more
The paper critiques current AI benchmarking practices for low-resource settings, arguing that evaluation must shift focus from isolated model performance to the holistic performance of the deployed sy…
This paper studies AI development frameworks for software engineering and proposes a six-dimension process taxonomy.
The paper empirically investigates whether AI-generated reviews can improve the drafting process of academic papers, finding that AI reviews cover many human-identified issues but also introduce novel…
This study investigated the stability and prompt-responsiveness of AI tools in classifying the cognitive demand of math tasks, finding that few-shot prompting was a more reliable performance booster t…
Zelin Zhang, Qi Li, Jie Cao, Lingshuang Liu +1 more
The paper analyzes the escalating security and safety threats posed by generative AI systems as they transition from merely generating content to executing real-world actions via tools and agents, fin…
The paper investigates how AI coding assistants shift developers' security focus from proactive prevention to reactive review, finding that this structural change is reinforced by current tool interac…
This study analyzes ClinicalTrials.gov records to track the rising trend of AI in clinical trials and demonstrates that a hybrid human-AI screening approach is viable but requires clearer reporting of…
The study found that while multi-agent LLM code generation architectures significantly affect code complexity, the added complexity does not translate into better functional correctness, suggesting ar…
The paper demonstrates that adopting LLM-based tools in cybersecurity operations requires a sociotechnical, practitioner-centered co-creation approach, which successfully overcame historical adoption…
The paper introduces RADAR, a risk-aware automated code review system, demonstrating that it can significantly reduce review bottlenecks and improve efficiency for AI-generated code without compromisi…