Program Synthesis
Automated programming, code generation, and program induction
20 papers indexed
Combinatorial Synthesis: Scaling Code RLVR via Atomic Decomposition and Recombination
Jiasheng Zheng, Boxi Cao, Boxi Yu, Yuzhong Zhang +5 more
The paper introduces Atomic Decomposition and Recombination (ADR), a framework that generates genuinely novel and challenging verifiable code tasks, significantly improving the scalability of Re…
Bridging the Sim-to-Real Gap in Semiconductor Visual Program Synthesis via Input Binarization
Yusuke Ohtsubo, Kota Dohi, Koichiro Yawata, Koki Takeshita +1 more
The paper proposes a visual program synthesis framework using a VLM to generate accurate training data for semiconductor inspection, mitigating the sim-to-real gap by applying input binarization to st…
An Empirical Evaluation of LLM-Generated Code Security Across Prompting Methods
The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…
False Security Confidence in Benign LLM Code Generation
The paper introduces False Security Confidence (FSC), a new metric to measure the inherent prevalence of security vulnerabilities in code generated by LLMs that are otherwise functionally correct, eve…
Detecting Privilege Escalation in Polyglot Microservices via Agentic Program Analysis
The paper introduces Neo, an agentic program analysis framework that successfully detects zero-day privilege escalation vulnerabilities in complex, polyglot microservices by combining LLMs with advanc…
ALPS: Automated Least-Privilege Enforcement for Securing Serverless Functions
ALPS is an automated, vendor-agnostic framework that enforces least privilege in serverless functions by analyzing code and generating precise security policies, achieving high coverage and significan…
Evolving Features vs Evolving Entire Trees with GP for Interpretable Survival Analysis
This paper proposes using genetic programming (GP) to jointly evolve both the feature sets and the structure of survival trees, resulting in highly interpretable and high-performing shallow models for…
FVSpec: Real-World Property-Based Tests as Lean Challenges
The paper introduces FVSpec, a large-scale benchmark that translates thousands of real-world Python property-based tests into formal Lean 4 specifications to evaluate AI models for formal software ver…
Vulnerability Abundance: A formal proof of infinite vulnerabilities in code
The paper provides a formal proof that a single C program can contain a countably infinite number of distinct, independently assignable software vulnerabilities, suggesting the set of all software vul…
Inferring Code Correctness from Specification
The paper introduces TRAILS, a novel method that improves code correctness validation by grounding LLM reasoning in concrete (input, output) pairs derived from specifications, achieving state-of-the-…
Grid Programs: A Two-Dimensional, Variable-Free Model of Computation
The paper introduces Grid Programs, a novel, Turing-complete model of computation where programs are two-dimensional arrangements of instructions, fundamentally departing from linear code structures.
MOSAIC: Modular Orchestration for Structured Agentic Intelligence and Composition
MOSAIC introduces a structured agentic framework that treats automated data science as a staged, context-grounded model selection problem, improving performance and traceability over traditional AutoM…
Evolutionary Discovery of Bivariate Bicycle Codes with LLM-Guided Search
The paper introduces an LLM-guided evolutionary workflow that successfully discovers and certifies a large number of novel bivariate quantum error-correcting codes, demonstrating the utility of LLMs i…
Healthcare Mechanisms from Policy-as-Code Search under Strategic Provider Response
The paper models healthcare mechanism design as program synthesis, demonstrating that an optimized, mixed-objective program can eliminate up-coding and reduce patient rejection while maintaining finan…
LACUNA: Safe Agents as Recursive Program Holes
Yaoyu Zhao, Yichen Xu, Oliver Bračevac, Cao Nguyen Pham +2 more
The paper introduces LACUNA, a novel programming model that allows LLM agents to write code that shapes the runtime environment while maintaining strong type-checking safety guarantees.
SRTJ: Self-Evolving Rule-Driven Training-Free LLM Jailbreaking
Jindong Li, Ying Liu, Yali Fu, Jinjing Zhu +3 more
The paper proposes SRTJ, a Self-Evolving Rule-Driven Training-Free Jailbreak framework that systematically discovers and refines attack strategies using rule composition and feedback to achieve robust…
Adaptive and AI-Augmented Security Testing: A Systematic Survey of Program Analysis, Feedback-Driven Testing, and Hybrid Learning-Based Approaches
This paper systematically surveys adaptive and AI-augmented security testing, concluding that a major gap exists (structural-adaptive fragmentation) where current systems fail to integrate structural pr…
Memory Forensics Techniques for Automated Detection and Analysis of Go Malware
The paper introduces a novel memory forensics framework to perform runtime analysis of Go malware, successfully recovering critical execution state and artifacts that are invisible to traditional stat…
Cookie-Bench: Continuous On-screen Key Interaction Evaluation for Web Generation
Haoyue Yang, Zhangxiao Shen, Fan Ding, Hangting Lou +7 more
The paper introduces Cookie-Bench, a novel, autonomous, and reference-free evaluation framework that significantly improves the assessment of interactive web generation capabilities for frontier LLMs.
LogicEval: A Systematic Framework for Evaluating Automated Repair Techniques for Logical Vulnerabilities in Real-World Software
Syed Md Mukit Rashid, Abdullah Al Ishtiaq, Kai Tu, Yilu Dong +6 more
The paper introduces LogicEval, a systematic framework and dataset (LogicDS) to evaluate automated repair techniques for logical software vulnerabilities, finding that prompt sensitivity and context l…