I am learning about the idea of correlation in statistics, and I came across the following
Statement: the best fit line of bivariate normal data passes through extrema of level sets. That is, if $(X,Y)\sim \mathcal{N}(0,\Sigma)$, then the best fit line $l$ passes through the extrema of the level sets of $f$, the PDF of $(X,Y)$.
For example, in the following picture, if the red line is a level set of $f$, then $l$ would pass through the two (black) labeled points.
The best fit line $l$ is defined to minimise the sum of the squares of the horizontal distances of the data points to $l$ (i.e. it is the least-squares regression of $x$ on $y$). Let's say $l$ fits a large number of data points that are i.i.d. copies of $(X,Y)$.
The PDF is $f(x,y) = \frac{1}{\sqrt{(2\pi)^2|\Sigma|}}\exp\left(-\frac12(x\ y)\Sigma^{-1}\binom{x}{y}\right)$ and the marginal density is (e.g.) $f_X(x) =\int_{-\infty}^{\infty}f(x,y)\,dy$.
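A quick numerical sanity check of the statement (not a proof) can be sketched as follows. Everything specific here is an assumption made for illustration: the example covariance matrix `SIGMA`, the level $c = 1$, the sample size, and the helper names `fit_horizontal_least_squares` and `top_of_level_set`. The sketch also assumes that "extrema" means the highest and lowest points of the level set $\{v : v^\top\Sigma^{-1}v = c\}$, which is the reading that matches a fit minimising horizontal distances.

```python
import numpy as np

# Assumed example covariance matrix (any symmetric positive-definite one would do).
SIGMA = np.array([[2.0, 0.8],
                  [0.8, 1.0]])


def fit_horizontal_least_squares(sigma, n=200_000, seed=0):
    """Sample n points from N(0, sigma) and fit x = a*y + b by least squares.

    Minimising squared horizontal distances to a line is the same as
    regressing x on y.
    """
    rng = np.random.default_rng(seed)
    pts = rng.multivariate_normal(mean=[0.0, 0.0], cov=sigma, size=n)
    a, b = np.polyfit(pts[:, 1], pts[:, 0], deg=1)  # returns slope, intercept
    return a, b


def top_of_level_set(sigma, c=1.0, n_grid=100_001):
    """Highest point of the ellipse v^T sigma^{-1} v = c, found by brute force.

    The ellipse is parametrised as v = sqrt(c) * L @ u(theta), where L is the
    Cholesky factor of sigma and u(theta) runs over the unit circle.
    """
    theta = np.linspace(0.0, 2.0 * np.pi, n_grid)
    unit_circle = np.vstack([np.cos(theta), np.sin(theta)])  # shape (2, n_grid)
    ellipse = np.sqrt(c) * np.linalg.cholesky(sigma) @ unit_circle
    return ellipse[:, np.argmax(ellipse[1])]  # point with the largest y


a, b = fit_horizontal_least_squares(SIGMA)
x_top, y_top = top_of_level_set(SIGMA)

print(f"fitted line:        x = {a:.3f} * y + {b:.3f}")
print(f"line at y = y_top:  x = {a * y_top + b:.3f}")
print(f"top of level set:   x = {x_top:.3f}, y = {y_top:.3f}")
```

If the statement holds, the second and third printed $x$ values should agree up to sampling noise. By the symmetry $v \mapsto -v$ of the level set, the lowest point is the negative of the highest one, so checking one of them is enough.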
Problem: How to prove the statement? I know you can find $l$ with calculus but how does it relate to the extrema of the level sets? Thanks.


