My question is clearly related to this one, but my interest is not specifically in Heisenberg's result. To quote from Wikipedia:
A nonzero function and its Fourier transform cannot both be sharply localized at the same time. A similar tradeoff between the variances of Fourier conjugates arises in all systems underlain by Fourier analysis, for example in sound waves: A pure tone is a sharp spike at a single frequency, while its Fourier transform gives the shape of the sound wave in the time domain, which is a completely delocalized sine wave.
Who first explicitly noted the fact stated in the first sentence (a nonzero function and its Fourier transform cannot both be sharply localized at the same time), and who first found that the Gaussian attains the optimal trade-off, that is, equality in the bound relating the spreads in the time and frequency domains?
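
For concreteness, here is one common way to make the statement precise. The normalization of the Fourier transform and the choice to measure spread about the origin are my assumptions, not part of the quoted passage; other conventions change the constant but not the content. With $\hat f(\xi) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x \xi}\, dx$ and $\int |f(x)|^2\, dx = 1$:

$$
\left( \int_{-\infty}^{\infty} x^2\, |f(x)|^2\, dx \right)
\left( \int_{-\infty}^{\infty} \xi^2\, |\hat f(\xi)|^2\, d\xi \right)
\;\ge\; \frac{1}{16\pi^2},
$$

with equality exactly when $f(x) = C\, e^{-\pi x^2/\sigma^2}$ for some $\sigma > 0$ and a constant $C$ fixed by the normalization. I am asking who first stated the inequality in this purely Fourier-analytic form, and who first identified the Gaussian as the equality case.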