~ similar to 2606.04126· 20 results
The paper introduces CASS-RTL, a novel, model-agnostic framework that enhances the functional correctness of Large Language Models (LLMs) generating Register-Transfer Level (RTL) code by leveraging th…
The paper introduces a data-centric optimization pipeline to improve coding agents' ability to interact with a branching lakehouse, showing significant accuracy gains by treating agent evaluation as a…
StepPRM-RTL is a novel framework that enhances LLM-based RTL code generation for digital hardware designs.
The paper introduces ProofLoop, a novel ReAct agent that uses a solver-in-the-loop approach to automatically generate and formally verify SystemVerilog Assertions (SVA) from natural language specifica…
Jiasheng Zheng, Boxi Cao, Boxi Yu, Yuzhong Zhang +5 more
The paper introduces Atomic Decomposition and Recombination (ADR), a novel framework that generates genuinely novel and challenging verifiable code tasks, significantly improving the scalability of Re…
This study benchmarks token-optimized formats (TOON and TRON) against JSON in end-to-end agentic AI systems, finding that TRON significantly reduces token overhead with minimal performance degradation…
AI-PROPELLER introduces a novel interprocedural code layout optimization system that uses an agentic evolutionary workflow to achieve significant, measurable performance gains in large-scale, real-wor…
The paper introduces FVSpec, a large-scale benchmark that translates thousands of real-world Python property-based tests into formal Lean 4 specifications to evaluate AI models for formal software ver…
The paper introduces Refute-or-Promote, an adversarial multi-agent review system that significantly improves the precision of LLM-assisted defect discovery by filtering out false positives.
Weixing Liu, Zizhen Liu, Jing Ye, Naixing Wang +3 more
FT-Pilot is a novel GNN-guided LLM framework that automatically rewrites RTL code to harden digital circuits against soft errors, providing an efficient, automated path for reliability optimization.
MOSAIC is a novel scheduling framework that significantly accelerates Mixture-of-Agents (MoA) workloads by jointly optimizing expert placement and utilizing confidence-aware adaptive aggregation.
Yipeng Gao, Lei Shu, Genzhi Ye, Xi Xiong +4 more
The paper introduces 3DCodeBench, a systematic benchmark and platform for evaluating Vision-Language Model (VLM) agents' ability to generate procedural 3D models from text and images using code.
The paper introduces Sovereign Agentic Loops (SAL), a control-plane architecture that decouples LLM reasoning from system execution to enhance safety and reliability in real-world AI agents.
The paper argues that current 'on-the-fly' AI agent design lacks necessary software engineering rigor and proposes an 'AI Workflow Store' to provide hardened, reusable, and reliable agent workflows.
This survey reviews the integration of AI and LLMs into hardware security verification, demonstrating its potential to automate complex stages while stressing the necessity of grounding AI outputs in…
Tomer Keren, Nitay Calderon, Asaf Yehudai, Yotam Perlitz +2 more
The paper introduces TASTE, an automatic task synthesis method that generates challenging agent benchmarks by evolving tool sequences, demonstrating that existing benchmarks are saturated and that TAS…
Ruiyin Li, Yiran Zhang, Xiyu Zhou, Yangxiao Cai +5 more
The paper introduces MAAD, a multi-agent framework that autonomously transforms software requirements into comprehensive, multi-view architectural blueprints, significantly improving completeness and…
This paper systematically analyzes the complex design space of hybrid multi-agent systems combining on-device and cloud AI models, finding that the optimal architecture is highly task-dependent and th…
The paper introduces Hyperparam, a set of lightweight JavaScript libraries designed to enable direct, model-aware querying of unstructured data (like agent traces) within client-side AI applications.
This review analyzes the dual impact of integrating Large Language Models (LLMs) into hardware design, detailing both their transformative potential in EDA and the critical security vulnerabilities th…