~ similar to 2604.18649v1· 20 results
The paper introduces the concept of 'authenticity debt'—the institutional liability from deploying unverified AI content—and proposes a layered reference architecture combining cryptographic provenanc…
The paper introduces the concept of 'authenticity debt'—the institutional liability from deploying unverified AI content—and proposes a layered reference architecture combining cryptographic provenanc…
Shuning Zhang, Eve He, Xiao Zhan, Shijing He +3 more
This paper investigates how Generative AI enables scalable, hyper-realistic fraud in Chinese e-commerce by fabricating product defect evidence, proposing new defense mechanisms like verifiable materia…
The paper proposes a unified evidentiary framework combining cryptographic provenance, statistical watermarking, and zero-knowledge attestation to address the legal challenges posed by synthetic media…
Yan Liang, Ziyuan Yang, Mengyu Sun, Joey Tianyi Zhou +1 more
The paper proposes SubPopMark, a novel subpopulation-driven framework that injects harmless, verifiable markers into distilled datasets to prevent copyright infringement and data leakage.
Zihan Wang, Rui Zhang, Yu Liu, Chi Liu +3 more
This paper presents the first systematic study of black-box skill stealing attacks against proprietary LLM agents, demonstrating that structured agent skills can be easily extracted, posing a signific…
The paper proposes an attestation-aware promotion gate to mitigate supply-chain risks in LLM pipelines by cryptographically verifying and enforcing claims about training and release artifacts before d…
The paper systematically evaluates various defense mechanisms against persistent memory attacks on LLM agents, finding that only tool-gating at the memory layer (Memory Sandbox) effectively mitigates…
The paper proposes a unified closed-loop threat taxonomy to systematically analyze and defend foundation models by explicitly framing the bidirectional security interactions between data and models.
TimeMark proposes a trustworthy time watermarking framework that uses cryptographic techniques and error-correcting codes to achieve 100% accurate recovery of the generation time from AIGC, resisting…
Bilgehan Sel, Xuanli He, Alwin Peng, Ming Jin +1 more
The paper introduces Trojan-Speak, an adversarial fine-tuning method that successfully bypasses advanced LLM safety classifiers (like Anthropic's Constitutional Classifiers) with minimal degradation t…
ACRFence introduces a framework-agnostic mitigation to prevent semantic rollback attacks in LLM agents by recording irreversible tool effects and enforcing strict replay-or-fork semantics upon checkpo…
This paper introduces the CER framework to address the complex problem of reconstructing AI-mediated losses for insurance claims, moving beyond simple event reconstruction to analyze the system's oper…
Xinlei Guan, David Arosemena, Tejaswi Dhandu, Kuan Huang +6 more
The paper proposes an end-to-end forensic pipeline using steganographic attribution and multimodal harm detection to reliably trace and attribute harmful misuse of AI-generated imagery on social platf…
Shuhao Zhang, Yuli Chen, Jiale Han, Bo Cheng +1 more
The paper proposes Adaptive Stealing (AS), a novel and more robust watermark stealing algorithm that dynamically selects optimal attack perspectives to significantly increase the efficiency of comprom…
The paper introduces a validated, consensus-labeled prompt bank that separates requests for executable malicious code (weapons) from requests for general harmful security knowledge, providing a more g…
The paper introduces Synthesis Data Reversion (SDR), a method that infers the data laundering transformation used in LLM training and synthesizes queries to restore the detection signals lost when pro…
The paper demonstrates that advanced capabilities, such as jailbreaking large language models and finding software vulnerabilities, can be achieved effectively at zero cost by coordinating multiple sm…
This paper proposes a modified SISA framework to achieve efficient class-level unlearning in CNNs, allowing the removal of specific data influence without full model retraining.
The paper proposes an architectural proxy (MCP) to enforce robust, reliable tool access control for LLM agents, demonstrating that this structural enforcement is necessary because prompt-based restric…