Benchmarking AI for low-resource contexts: Thinking beyond leaderboards