I'm learning to do Linear Discriminant Analysis with sklearn, and I'm a bit confused about the `scalings_` attribute of the fitted model. The LDA classifier can be written as
$$\delta_k(x)=x^{T}\Sigma^{-1} \mu_k + C_k,$$
with $C_k$ a constant not depending on $x$. The documentation says the scalings are the coefficients of the linear combinations of the predictors, so I assumed this was the $\Sigma^{-1} \mu_k$ term.
But after fitting an LDA classifier with `lda.fit(X_train, Y_train)` and comparing

    np.linalg.inv(lda.covariance_) @ lda.means_

with

    lda.scalings_

the two results are not the same; they don't even have the same shape. The first is a $2 \times 2$ matrix, while the second is a $1 \times 2$ row vector. So I'm not sure where my misunderstanding lies. Thank you for reading.
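For reference, here is a minimal sketch that reproduces the shape mismatch on toy data (my own made-up two-class, two-feature example, not the question's `X_train`/`Y_train`). Note that `store_covariance=True` is needed for `covariance_` to be populated with the default `'svd'` solver, and `means_` must be transposed so that each column is one class mean $\mu_k$:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Two Gaussian blobs sharing (roughly) the same covariance, 2 features each.
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)),
               rng.normal(3.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# store_covariance=True makes lda.covariance_ available with solver='svd'.
lda = LinearDiscriminantAnalysis(store_covariance=True).fit(X, y)

# One column of Sigma^{-1} mu_k per class: shape (n_features, n_classes) = (2, 2)
coef_like = np.linalg.inv(lda.covariance_) @ lda.means_.T
print(coef_like.shape)

# scalings_ has one column per discriminant direction, of which there are
# at most n_classes - 1: shape (n_features, n_classes - 1) = (2, 1)
print(lda.scalings_.shape)
```

So with two classes the two objects cannot match: $\Sigma^{-1}\mu_k$ gives one vector per class, while `scalings_` holds only `n_classes - 1` directions spanning the discriminant subspace.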