Temporally-Aligned Evaluation for Audio-Driven Talking Head Generation