When Data Is Scarce: Scaling Sparse Language Models with Repeated Training