ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2605.23643v1· 19 results

cs.CRRecentMay 28, 2026

Bridging Theory and Practice: An Executable Taxonomy of Security Properties for ProVerif and Tamarin

Leonard Tudorache, Ivan Kurtev, Mark van den Brand

The paper introduces a systematic, executable taxonomy of security properties to bridge the gap between theoretical security definitions and their practical implementation in formal verification tools…

View →
cs.CRcs.AIRecentApr 10, 2026

Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward

Weiyang Guo, Zesheng Shi, Zeen Zhu, Yuan Zhou +2 more

This paper introduces a novel backdoor attack (ACB) against Reinforcement Learning with Verifiable Rewards (RLVR), demonstrating that poisoning the training data can implant a backdoor that significan…

View →
cs.CRcs.AIRecentMar 18, 2026

Post-Training Local LLM Agents for Linux Privilege Escalation with Verifiable Rewards

Philipp Normann, Andreas Happe, Jürgen Cito, Daniel Arp

The paper proposes a two-stage post-training pipeline to create a small, local LLM agent (PrivEsc-LLM) capable of performing Linux privilege escalation, achieving high success rates while drastically…

View →
cs.CRcs.LGRecentMar 19, 2026

Towards Verifiable AI with Lightweight Cryptographic Proofs of Inference

Pranay Anchuri, Matteo Campanelli, Paul Cesaretti, Rosario Gennaro +3 more

The paper introduces a lightweight, sampling-based cryptographic protocol for verifiable AI inference that drastically reduces proving overhead from minutes to milliseconds by leveraging statistical p…

View →
cs.AIRecentMay 30, 2026

Certificate-Guided Evaluation of Reinforcement Learning Generalization

Vignesh Subramanian, Đorđe Žikelić, Suguman Bansal

The paper introduces a logic-driven framework using a neural certificate function to rigorously evaluate and benchmark the generalization capabilities of reinforcement learning algorithms on unseen ta…

View →
cs.CRRecentMar 25, 2026

AgentRFC: Security Design Principles and Conformance Testing for Agent Protocols

Shenghan Zheng, Qifan Zhang

The paper introduces a comprehensive security framework, AgentRFC, to systematically analyze and test the security conformance of various AI agent protocols, identifying critical design gaps, especial…

View →
cs.CRcs.AIcs.MARecentMar 23, 2026

STRIATUM-CTF: A Protocol-Driven Agentic Framework for General-Purpose CTF Solving

James Hugglestone, Samuel Jacob Chacko, Dawson Stoller, Ryan Schmidt +1 more

The paper introduces STRIATUM-CTF, a modular agentic framework that uses a standardized context protocol to enable LLMs to perform multi-step, stateful reasoning for general-purpose CTF solving, achie…

View →
cs.SEcs.AIRecentMay 28, 2026

Agora: Toward Autonomous Bug Detection in Production-Level Consensus Protocols with LLM Agents

Xiang Liu, Sa Song, Zhaowei Zhang, Huiying Lan +5 more

The paper introduces Agora, a domain-aware multi-agent framework that successfully detects deep, previously unknown logic bugs in complex consensus protocols, outperforming existing LLM-based analysis…

View →
cs.CRcs.LORecentApr 14, 2026

COBALT-TLA: A Neuro-Symbolic Verification Loop for Cross-Chain Bridge Vulnerability Discovery

Dominik Blain

COBALT-TLA introduces a neuro-symbolic verification loop that successfully and autonomously discovers novel cross-chain bridge vulnerabilities by integrating an LLM with the TLA+ model checker.

View →
cs.CVcs.AIRecentMay 28, 2026

Reinforcement Learning with Robust Rubric Rewards

Ya-Qi Yu, Hao Wang, Fangyu Hong, Xiangyang Qu +14 more

The paper introduces $ ext{RLR}^3$, a novel framework that extends verifiable rewards in Reinforcement Learning to handle partially verifiable, multi-criteria vision-language tasks by integrating robu…

View →
cs.AIcs.CRRecentMar 26, 2026

On the Foundations of Trustworthy Artificial Intelligence

TJ Dunham

The paper proves that platform-deterministic inference is a necessary and sufficient condition for trustworthy AI, establishing that AI trust fundamentally relies on consistent arithmetic.

View →
cs.CRcs.AIcs.MARecentApr 3, 2026

SentinelAgent: Intent-Verified Delegation Chains for Securing Federal Multi-Agent AI Systems

KrishnaSaiReddy Patil

SentinelAgent introduces a formal framework, the Intent-Preserving Delegation Protocol (IPDP), to secure federal multi-agent AI systems by verifying complex delegation chains against seven properties,…

View →
cs.AIcs.LGcs.LORecentMay 29, 2026

Robust Shielding for Safe Reinforcement Learning

Edwin Hamel-De le Court, Thom Badings, Alessandro Abate, Francesco Belardinelli +1 more

The paper introduces a novel shielding framework for Robust MDPs (RMDPs) that guarantees safety under worst-case transition probabilities, enabling safe reinforcement learning even when transition dyn…

View →
cs.AIRecentMay 31, 2026

Before the Model Learns the Bug:Fuzzing RLVR Verifiers

Jaideep Ray

The paper introduces a verifier-fuzzing framework to detect and analyze failure modes in Reinforcement Learning with Verifiable Rewards (RLVR) where bugs in the reward verifier can be exploited by the…

View →
cs.AIcs.CLcs.CRRecentApr 18, 2026

The Cognitive Penalty: Ablating System 1 and System 2 Reasoning in Edge-Native SLMs for Decentralized Consensus

Syed Muhammad Aqdas Rizvi

The paper demonstrates that for edge-native SLMs used in decentralized governance, simpler, intuitive reasoning (System 1) is significantly more robust and efficient than complex, iterative deliberati…

View →
cs.AIcs.CRRecentMay 12, 2026

Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack

Hao Wang, Hanchen Li, Qiuyang Mang, Alvin Cheung +2 more

The paper introduces BenchJack, an automated red-teaming system that systematically audits popular AI agent benchmarks, revealing numerous reward-hacking exploits and demonstrating a method to signifi…

View →
cs.CRcs.AIRecentJun 2, 2026

Learn from Your Mistakes: Tree-like Self-Play for Secure Code LLMs

Wenqi Chen, Ziyan Zhang, Bing Wang, Lin Liu +2 more

The paper introduces Tree-like Self-Play (TSP), a novel framework that treats secure code generation as a fine-grained decision process, significantly improving LLM security by forcing the model to se…

View →
cs.CRcs.AIcs.GTRecentApr 24, 2026

Reconstructive Authority Model: Runtime Execution Validity Under Partial Observability

Marcelo Fernandez - TraslaIA

The paper introduces the Reconstructive Authority Model (RAM), a novel framework that proves execution validity by assessing state coverage rather than just state integrity, showing that existing atte…

View →
cs.AIRecentJun 1, 2026

TRON: Targeted Rule-Verifiable Online Environments for Visual Reasoning RL

Tianze Yang, Yucheng Shi, Ruitong Sun, Jingyuan Huang +2 more

The paper introduces TRON, an online, rule-verifiable environment substrate that generates an unbounded stream of fresh, controllable visual reasoning training instances, significantly improving RL pe…

View →