Gram: Assessing sabotage propensities via automated alignment auditing