ArXivCSExplorer
☆☆Bookmarks🏆RSSHow to UseFAQ
Built with and by Teycir Ben Soltane•
How to Use•FAQ•GitHub•arXiv.org•
Share:

~ similar to 2604.17093v1· 19 results

cs.CRcs.ARcs.LGRecentMay 11, 2026

LLMs for Secure Hardware Design and Related Problems: Opportunities and Challenges

Johann Knechtel, Ozgur Sinanoglu, Ramesh Karri

This review analyzes the dual impact of integrating Large Language Models (LLMs) into hardware design, detailing both their transformative potential in EDA and the critical security vulnerabilities th…

View →
cs.CRcs.ARRecentApr 29, 2026

SafeTune: Mitigating Data Poisoning in LLM Fine-Tuning for RTL Code Generation

Mahshid Rezakhani, Nowfel Mashnoor, Kimia Azar, Hadi Kamali

SafeTune is a framework that enhances the robustness of LLMs fine-tuned for RTL code generation by detecting and mitigating data poisoning attacks, particularly those aiming to insert hardware Trojans…

View →
cs.CRRecentMar 18, 2026

SoK: From Silicon to Netlist and Beyond $-$ Two Decades of Hardware Reverse Engineering Research

Zehra Karadağ, Simon Klix, René Walendy, Felix Hahn +4 more

This paper systematizes two decades of hardware reverse engineering research by analyzing 187 publications, identifying key technical methods and recommending improvements for reproducibility, standar…

View →
cs.CRcs.LGRecentApr 22, 2026

Breaking Bad: Interpretability-Based Safety Audits of State-of-the-Art LLMs

Krishiv Agarwal, Ramneet Kaur, Colin Samplawski, Manoj Acharya +5 more

The paper conducts an interpretability-driven safety audit of eight state-of-the-art LLMs, demonstrating that while interpretability-based steering is a powerful auditing tool, model robustness varies…

View →
cs.CRcs.CLRecentMar 25, 2026

Analysing the Safety Pitfalls of Steering Vectors

Yuxiao Li, Alina Fastowski, Efstratios Zaradoukas, Bardh Prenkaj +1 more

This paper systematically audits the safety implications of activation steering vectors, finding that these vectors significantly influence the success rate of jailbreak attacks by overlapping with la…

View →
cs.CRcs.LGRecentMay 28, 2026

Dissecting the Black Box: Circuit-Level Analysis of LLM Vulnerability Detection

Syafiq Al Atiiq, Chun Zhou, Christian Gehrmann

The paper analyzes LLM vulnerability detection using mechanistic interpretability, finding that models primarily rely on safety detectors rather than direct vulnerability signature recognition.

View →
cs.CRRecentMar 22, 2026

Hardware Trojans from Invisible Inversions: On the Trojanizability of Standard Cell Libraries

Kolja Dorschel, René Walendy, Lukas Plätz, Thorben Moos +2 more

The paper analyzes existing hardware Trojan datasets to demonstrate that standard cell libraries can be systematically exploited to create visually undetectable, stealthy hardware Trojans, exemplified…

View →
cs.CRRecentApr 14, 2026

Can Agents Secure Hardware? Evaluating Agentic LLM-Driven Obfuscation for IP Protection

Sujan Ghimire, Parsa Mirfasihi, Muhtasim Alam Chowdhury, Veeramani Pugazhenthi +5 more

This paper introduces an agentic LLM-driven framework that automates the generation of functionally correct and security-relevant hardware netlist obfuscation for protecting intellectual property.

View →
cs.CRcs.AIRecentApr 18, 2026

SafeDream: Safety World Model for Proactive Early Jailbreak Detection

Bo Yan, Weikai Lin, Yada Zhu, Song Wang

SAFEDREAM introduces a lightweight, external world-model framework that proactively detects multi-turn jailbreak attacks by modeling cumulative safety erosion and predicting early failure points.

View →
cs.CRcs.CLRecentMay 11, 2026

LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments

Chiyu Zhang, Huiqin Yang, Bendong Jiang, Xiaolei Zhang +7 more

The paper introduces LITMUS, a novel benchmark that rigorously tests LLM agents for dangerous, physical-layer behavioral jailbreaks in real OS environments, revealing that current agents frequently ex…

View →
cs.CRcs.CLRecentApr 14, 2026

Compiling Activation Steering into Weights via Null-Space Constraints for Stealthy Backdoors

Rui Yin, Tianxu Han, Naen Xu, Changjiang Li +7 more

The paper proposes a novel method to inject reliable, sustained backdoors into LLMs by compiling an activation steering vector into model weights, ensuring the backdoor only activates upon a specific…

View →
cs.CRRecentApr 16, 2026

Emulation-based System-on-Chip Security Verification: Challenges and Opportunities

Tanvir Rahman, Shuvagata Saha, Ahmed Y. Alhurubi, Sujan Kumar Saha +2 more

This paper surveys the use of hardware emulation for security verification in System-on-Chip (SoC) design, positioning emulation as a critical, high-fidelity pre-silicon assurance technology.

View →
cs.CRRecentApr 1, 2026

When Safe Models Merge into Danger: Exploiting Latent Vulnerabilities in LLM Fusion

Jiaqing Li, Zhibo Zhang, Shide Zhou, Yuxi Li +2 more

The paper introduces TrojanMerge, a framework demonstrating that model merging can be exploited to systematically compromise the safety alignment of multiple individually safe LLMs.

View →
cs.CRRecentApr 2, 2026

AI-Assisted Hardware Security Verification: A Survey and AI Accelerator Case Study

Khan Thamid Hasan, Md Ajoad Hasan, Nashmin Alam, Md. Touhidul Islam +2 more

This survey reviews the integration of AI and LLMs into hardware security verification, demonstrating its potential to automate complex stages while stressing the necessity of grounding AI outputs in…

View →
cs.CRRecentApr 2, 2026

Assertain: Automated Security Assertion Generation Using Large Language Models

Shams Tarek, Dipayan Saha, Khan Thamid Hasan, Sujan Kumar Saha +2 more

Assertain is an automated framework that uses large language models and design analysis to generate high-quality, executable security assertions for hardware designs, significantly outperforming state…

View →
cs.CRcs.SERecentMay 15, 2026

Compositional Jailbreaking: An Empirical Analysis of Mutator Chain Interactions in Aligned LLMs

Reinelle Jan Bugnot, Soohyeon Choi, Hoon Wei Lim, Yue Duan

This paper systematically analyzes the interaction of multiple weak jailbreak attacks (mutators) applied sequentially to LLMs, finding that most combinations fail due to destructive interference, reve…

View →
cs.CRcs.AIRecentMay 4, 2026

When Alignment Isn't Enough: Response-Path Attacks on LLM Agents

Mingyu Luo, Zihan Zhang, Zesen Liu, Yuchong Xie +6 more

This paper introduces the Relay Tampering Attack (RTA), demonstrating that malicious third-party relays can undermine the security of LLM agents by modifying responses post-alignment, even if the LLM…

View →
cs.CRcs.AIRecentMay 19, 2026

Exploring and Developing a Pre-Model Safeguard with Draft Models

Hongyu Cai, Arjun Arunasalam, Yiming Liang, Antonio Bianchi +1 more

The paper proposes a novel pre-model safeguard that uses small draft models (SLMs) to predict the safety of prompts, significantly reducing false-negative rates while maintaining low computational ove…

View →
cs.CRcs.LORecentApr 14, 2026

COBALT-TLA: A Neuro-Symbolic Verification Loop for Cross-Chain Bridge Vulnerability Discovery

Dominik Blain

COBALT-TLA introduces a neuro-symbolic verification loop that successfully and autonomously discovers novel cross-chain bridge vulnerabilities by integrating an LLM with the TLA+ model checker.

View →