Lav R. Varshney
3 indexed papers
Publications per year
Top categories
Frequent co-authors
Research Timeline
The paper introduces TraceGuard, a detectability-aware antidistillation method that identifies and poisons 'thought anchors'—sparsely critical sentences—to degrade student model learning without making the defense obvious.
The paper introduces containment verification, a novel method that provides safety guarantees by formally verifying the agentic framework itself, ensuring safety regardless of the underlying AI model's capabilities.
The paper introduces an optimal black-box auditing framework using Donsker-Varadhan estimators to estimate Rényi differential privacy (RDP) guarantees for machine learning algorithms.
Papers
Optimal Guarantees for Auditing Rényi Differentially Private Machine Learning
The paper introduces an optimal black-box auditing framework using Donsker-Varadhan estimators to estimate Rényi differential privacy (RDP) guarantees for machine learning algorithms.