Revisiting Padded Transformer Expressivity: Which Architectural Choices Matter and Which Don't