Similar to 2605.27849 · 20 results
Marko Kojic, Ivan Bondyrev, Aral de Moor, Joseph Shtok +5 more
Mellum 2 is an open-weight 12B Mixture-of-Experts (MoE) language model specialized for software engineering, achieving performance competitive with larger models while maintaining the efficiency of a…
The paper introduces CodeGolf Bench, a novel multi-language benchmark using code golf to measure LLMs' ability to generate highly concise and efficient code, showing that reasoning models significantl…
Yilun Yao, Jiaming Pan, Elsie Dai, Peizhuang Cong +2 more
ConMoE proposes a train-free method for compressing Mixture-of-Experts (MoE) models by consolidating the large expert pool into a smaller set of reusable prototypes and deterministically remapping all…
The paper introduces TRAILS, a novel method that improves code correctness validation by grounding LLM reasoning in concrete (input, output) pairs derived from specifications, achieving state-of-the-…
This paper systematically studies how soft errors propagate during Large Language Model (LLM) inference using a novel fault-injection framework, providing critical insights and mitigation strategies f…
The paper introduces LLM4CodeRE, a domain-adaptive LLM framework that significantly improves bidirectional code reverse engineering by unifying assembly-to-source and source-to-assembly translation.
Sicheng Feng, Zigeng Chen, Gongfan Fang, Xinyin Ma +1 more
dMoE proposes a block-level Mixture-of-Experts (MoE) framework for Diffusion Large Language Models (dLLMs) that aggregates token-level expert distributions into a unified block-level distribution, sig…
The paper introduces prefix filters and an algorithm (Palla) to systematically learn and apply specific error patterns in Large Language Models, significantly improving constrained generation tasks li…
Hao Wang, Niels Mündler, Mark Vero, Jingxuan He +2 more
The paper introduces SecPI, a fine-tuning pipeline that teaches reasoning language models (RLMs) to autonomously internalize structured security reasoning, significantly improving secure code generati…
The paper proposes projectional decoding, a novel framework that integrates a partial graph model alongside text generation to ensure the semantic validity of LLM-generated software artifacts.
The paper introduces functional entropy, a code-specific uncertainty quantification method, which successfully predicts functional correctness in LLM-generated code by replacing natural language seman…
OverrideFuzz is a novel semantic-aware grammar fuzzer designed to test script-language runtimes by specifically modeling and exploiting complex behaviors like method overriding and dynamic rebinding,…
This paper identifies the 'Format-Reliability Gap'—where LLMs know about code vulnerabilities but generate insecure code anyway—and proposes a localized, per-vulnerability steering vector fix that sig…
Jona te Lintelo, Lichao Wu, Marina Krček, Sengim Karayalçin +1 more
MASCing is a novel framework that enables flexible, non-retraining reconfiguration of Mixture-of-Experts (MoE) models for specific safety objectives by applying activation steering masks to control ex…
The paper introduces FORGE, a feedback-driven execution system that improves LLM-based binary analysis by interleaving reasoning and tool interaction, achieving high-quality vulnerability discovery on…
Guanzhi Deng, Kuan Wu, Haibo Wang, Shing Yin Wong +2 more
The paper introduces RA-MoE, a novel fine-tuning framework that leverages the internal routing structure of Mixture-of-Experts (MoE) models to improve performance on multilingual downstream tasks by a…
Junhyuck Kim, Jihun Yun, Haechan Kim, Gyeongman Kim +2 more
The paper introduces a systematic framework to convert large Mixture-of-Experts (MoE) models into memory-efficient, fully dense architectures, achieving superior performance compared to traditional pr…
Jiasheng Zheng, Boxi Cao, Boxi Yu, Yuzhong Zhang +5 more
The paper introduces Atomic Decomposition and Recombination (ADR), a framework that generates genuinely novel and challenging verifiable code tasks, significantly improving the scalability of Re…
The paper empirically evaluates the security quality of LLM-generated code across various prompting methods, finding that while prompting alters the structure of weaknesses, it is insufficient to reli…
Jiarui Feng, Hanqing Zeng, Karish Grover, Ruizhong Qiu +10 more
The paper proposes DAG-MoE, a novel sparse Mixture-of-Experts framework that replaces standard weighted-sum aggregation with structural aggregation to enhance model performance and enable multi-step r…