~ similar to 2605.29490v1· 20 results
Han Dai, Soumyakant Priyadarshan, Abdullah Imran, Ruoyu Wang +1 more
SCRIBE is a novel framework that enables reliable source-level patching of binaries by performing 'binary-aware' recompilation, successfully resolving syntactic and semantic inaccuracies inherent in d…
The paper proposes CoDe-R, a two-stage framework that significantly improves the accuracy and re-executability of decompiled code generated by LLMs, achieving a new SOTA in the lightweight regime.
The paper introduces Decaf, a system that uses automatic feedback and search to significantly improve the semantic correctness and accuracy of neural decompilers, boosting the decompilation rate from…
The paper introduces SCDBench, a comprehensive benchmark dataset and methodology that rigorously evaluates LLM-based smart contract decompilers, finding that while frontier models can produce compilab…
The paper introduces SCDBench, a comprehensive benchmark dataset and methodology that rigorously evaluates LLM-based smart contract decompilers, finding that while frontier LLMs can generate compilabl…
The paper proposes a new binary format that embeds compiler-generated metadata into executables, making the binary structure more transparent and enabling reliable analysis, instrumentation, and recom…
The paper introduces REBench, a comprehensive, standardized benchmark dataset designed to enable fair and rigorous evaluation of Large Language Models (LLMs) on complex binary reverse engineering task…
Elevator is a novel, deterministic binary translator that statically translates entire x86-64 executables to AArch64 by considering all possible interpretations of every byte, eliminating the need for…
The paper introduces LLM4CodeRE, a domain-adaptive LLM framework that significantly improves bidirectional code reverse engineering by unifying assembly-to-source and source-to-assembly translation.
This paper analyzes various source-to-bytecode obfuscation techniques for Erlang, demonstrating that effective protection relies on exploiting the representational gaps between high-level semantics an…
The paper introduces Heimdall, an automated pipeline that uses LLMs and formal verification to safely and automatically migrate legacy, potentially buggy eBPF programs written in C to memory-safe Rust…
Huihui Huang, Jieke Shi, Bo Wang, Zhou Yang +1 more
MemHint is a neuro-symbolic static analysis pipeline that significantly improves memory leak detection in C/C++ by combining LLM semantic understanding with Z3 symbolic reasoning, detecting more leaks…
Jun Zhang, JianYing Qu, Hanwen Du, Zhongkai Sun +2 more
The paper introduces Code-QA-Bench, a novel framework that rigorously separates genuine code reasoning from mere documentation memorization in repository-level code understanding benchmarks.
The paper introduces codebadger, a Model Context Protocol (MCP) server that integrates Joern's Code Property Graph (CPG) with LLMs, enabling large language models to perform large-scale, semantic prog…
Sangjun An, Hyeyeon Park, Yejin Son, Seoksu Lee +1 more
The paper proposes a novel framework to analyze large, obfuscated binaries by decomposing them into structurally coherent units, enabling large-scale dataset generation for LLM-based analysis.
Yuchen Chen, Yuan Xiao, Chunrong Fang, Zhenyu Chen +1 more
DuCodeMark introduces a robust, dual-purpose watermarking technique that embeds ownership signals into code datasets, ensuring protection across both source-code generation and decompilation tasks.
The paper introduces a novel multi-LLM orchestration system combined with symbolic execution to successfully detect memory vulnerabilities in uncompilable, incomplete Rust CVE code snippets, achieving…
The paper introduces PeAR, a static binary rewriting framework that proves static binary instrumentation (SBI) is a practical and effective alternative to dynamic binary instrumentation (DBI) for high…
PUSHAN is a novel, trace-free technique that successfully deobfuscates virtualization-obfuscated binaries, providing complete Control Flow Graphs (CFGs) and high-quality C pseudocode for effective ana…
Chang Liu, Noah Fleischmann, Nicolò Altamura, Edward Raff +2 more
The paper introduces ASSEMBLAGE-DEEPHISTORY, a novel, comprehensive binary dataset that unifies cross-compiler builds, historical versions, and vulnerability labels into a single, queryable structure.