Cross-modal linkage risk in clinical vision-language models