ROAST: Risk-aware Outlier-exposure for Adversarial Selective Training of Anomaly Detectors Against Evasion Attacks
ROAST is a risk-aware selective training framework that improves anomaly detector recall against evasion attacks by focusing training on less vulnerable patients, significantly reducing false negatives.
Abstract
Safety-critical domains like healthcare rely on deep neural networks (DNNs) for prediction, yet DNNs remain vulnerable to evasion attacks. Anomaly detectors (ADs) are widely used to protect DNNs, but conventional ADs are trained indiscriminately on benign data from all patients, overlooking physiological differences that introduce noise, degrade robustness, and reduce recall. In this paper, we propose ROAST, a novel risk-aware outlier exposure (OE) selective training framework that improves AD recall while largely preserving precision. ROAST identifies patients who are less vulnerable to attack and focuses training on these cleaner, more reliable data, thereby reducing false negatives and improving recall. To preserve precision, the framework applies OE by injecting adversarial samples into the training set of the less vulnerable patients, avoiding noisy data from others. Experiments show that ROAST increases recall by 16.2% (black-box attack setting) and 5.89% (white-box attack setting) on average while reducing the training time by 88.3% on average compared to indiscriminate training, with minimal impact on precision.
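The selection and outlier-exposure steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how patient vulnerability is measured or which detector is trained, so the per-patient `risk` scores, the `threshold` value, the toy one-dimensional samples, and all function names here are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = List[float]


def select_low_risk_patients(risk: Dict[str, float], threshold: float) -> List[str]:
    """Return patients whose assumed vulnerability score is at or below the threshold."""
    return sorted(p for p, score in risk.items() if score <= threshold)


def build_oe_training_set(
    benign: Dict[str, List[Sample]],
    adversarial: Dict[str, List[Sample]],
    selected: List[str],
) -> Tuple[List[Sample], List[int]]:
    """Collect benign samples (label 0) and injected adversarial samples (label 1)
    from the selected patients only; data from all other patients is skipped."""
    features: List[Sample] = []
    labels: List[int] = []
    for patient in selected:
        for sample in benign.get(patient, []):
            features.append(sample)
            labels.append(0)
        for sample in adversarial.get(patient, []):
            features.append(sample)
            labels.append(1)
    return features, labels


def recall(y_true: List[int], y_pred: List[int]) -> float:
    """Recall = TP / (TP + FN), treating label 1 (adversarial) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy data (hypothetical): risk scores and 1-D samples for three patients.
risk = {"p1": 0.10, "p2": 0.65, "p3": 0.20}
benign = {"p1": [[0.1], [0.2]], "p2": [[0.9]], "p3": [[0.3]]}
adversarial = {"p1": [[1.5]], "p2": [[1.1]], "p3": [[1.7]]}

selected = select_low_risk_patients(risk, threshold=0.25)
X_train, y_train = build_oe_training_set(benign, adversarial, selected)
```

In this toy run, patient `p2` is excluded because its assumed risk score exceeds the threshold, so the detector would be trained only on benign and adversarial samples from `p1` and `p3`. Any detector that accepts labeled outliers could then be fit on `X_train` and `y_train`, with `recall` used to compare it against a detector trained on all patients.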