The Importance of Phase in Neural Representations: An Internal Oppenheim-Lim Test of Image Classifiers
This paper investigates whether deep learning models retain the phase/sign asymmetry of natural images in their hidden layers and tests it causally.
It provides a mechanistic account of the texture--shape gap between CNNs and attention models by showing that different architectures share a phase/sign identity code.
Before reading this…
Applications
- Computer vision, image recognition, deep learning models
To understand this paper, make sure you know these concepts first:
- Understanding of deep learning models, image recognition, Fourier transform
Abstract
Oppenheim and Lim (1981) showed that natural images stay recognizable when reconstructed from their Fourier phase alone, while the magnitude carries little of their identity. We ask whether trained image classifiers reproduce this asymmetry inside their hidden layers, and we test it causally: given two images, we transplant the phase of one onto the magnitude of the other at a chosen layer and record which image the prediction follows. In PRISM2D, GFNet, and ViT-B/16 the prediction follows the phase or sign donor, and deleting all image-specific magnitude barely moves accuracy, so identity rides on phase while image-specific magnitude is largely dispensable to the readout. ResNet-50 at first seems to break the pattern, because transplanting sign after its ReLUs does nothing; a fair intervention before the ReLU reveals a strong latent sign code in the late blocks, and a DC-only control shows the readout consumes a channel-wise spatial average. Controls rule out the trivial case in which magnitude simply stops depending on the image. The architectures therefore share a phase/sign identity code but expose it in different bases, set by rectification and readout geometry, which gives a mechanistic account of the texture--shape gap between CNNs and attention models.
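The phase transplant described in the abstract can be illustrated at the pixel level with a minimal NumPy sketch. This is an assumption-laden illustration of the classic Oppenheim-Lim swap on a single 2D array, not the paper's layer-wise intervention: the function name `phase_swap` and its arguments are hypothetical, and the real experiments operate on hidden activations inside each model rather than on raw inputs.

```python
import numpy as np

def phase_swap(mag_donor, phase_donor):
    """Combine the Fourier magnitude of one 2D array with the phase of another.

    Hypothetical helper for illustration only; both inputs must be real
    arrays of the same shape.
    """
    spec_mag = np.fft.fft2(mag_donor)
    spec_phase = np.fft.fft2(phase_donor)
    # Keep |F| from the magnitude donor and the angle from the phase donor.
    hybrid_spec = np.abs(spec_mag) * np.exp(1j * np.angle(spec_phase))
    # For real inputs the hybrid spectrum stays conjugate-symmetric, so the
    # imaginary part of the inverse transform is only floating-point noise.
    return np.real(np.fft.ifft2(hybrid_spec))

# Usage: random arrays stand in for two grayscale images.
rng = np.random.default_rng(0)
image_a = rng.normal(size=(8, 8))
image_b = rng.normal(size=(8, 8))
hybrid = phase_swap(mag_donor=image_a, phase_donor=image_b)
```

In a test of the kind the abstract describes, the hybrid would then be passed onward through the network and the prediction compared against the labels of the two donors to see which one it follows.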