5

I know that in quadratic discriminant analysis (QDA) we use the variance of each class, so is the formula different than that in linear discriminant analysis (LDA)?

Is it $$\frac{1}{N-K} \sum (x - \mu)(x - \mu)^T$$ or $$\frac{1}{N} \sum (x - \mu)(x - \mu)^T,$$

and how can I generate a quadratic boundary equation from this?

amoeba
datatista

2 Answers

6

In a scenario with $N$ samples and $K$ classes (labels), the first formula should be

$$\frac{1}{N-K} \sum_{c=1}^K \sum_{y_i = c} (x_i - \hat \mu_c) (x_i - \hat \mu_c)^\intercal$$

and is for calculating the pooled covariance matrix, to be used if you're tying the covariance matrix across classes (as in LDA). The $N-K$ term arises from Bessel's correction: one degree of freedom is lost for each of the $K$ estimated class means.
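As a rough illustration of the pooled estimate, here is a minimal NumPy sketch. The function name `pooled_covariance` and the random toy data are assumptions made for this example, not part of any library.

```python
import numpy as np

def pooled_covariance(X, y):
    """Pooled within-class covariance with the 1/(N - K) normalisation."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n_samples, n_features = X.shape
    scatter = np.zeros((n_features, n_features))
    for c in classes:
        X_c = X[y == c]
        centered = X_c - X_c.mean(axis=0)  # subtract the class mean
        scatter += centered.T @ centered   # sum of outer products for class c
    return scatter / (n_samples - len(classes))

# Toy data (assumed for illustration): 3 classes, 20 samples each, 2 features
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = np.repeat([0, 1, 2], 20)
S_pooled = pooled_covariance(X, y)
```

The result is the same as averaging the per-class unbiased covariance matrices, each weighted by $N_c - 1$.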

If you're not tying the covariance matrices (as in QDA), then the covariance matrix for a class $c$ with $N_c$ samples is

$$\frac{1}{N_c - 1} \sum_{y_i = c} (x_i - \hat \mu_c) (x_i - \hat \mu_c)^\intercal$$

if you want an unbiased estimate of the covariance, or

$$\frac{1}{N_c} \sum_{y_i = c} (x_i - \hat \mu_c) (x_i - \hat \mu_c)^\intercal$$

if you want the maximum likelihood estimate (MLE) of the covariance, which is slightly biased.
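The two per-class normalisations differ only in the divisor, which `np.cov` exposes through its `ddof` argument. A small sketch, where the helper name `class_covariances` and the toy data are assumptions for illustration:

```python
import numpy as np

def class_covariances(X, y, unbiased=True):
    """Per-class covariance matrices, as used in QDA.

    unbiased=True  divides by N_c - 1 (ddof=1)
    unbiased=False divides by N_c     (ddof=0)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    ddof = 1 if unbiased else 0
    return {c: np.cov(X[y == c], rowvar=False, ddof=ddof) for c in np.unique(y)}

# Toy data (assumed): 2 classes, 20 samples each, 2 features
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.repeat([0, 1], 20)

cov_unbiased = class_covariances(X, y, unbiased=True)
cov_divide_by_n = class_covariances(X, y, unbiased=False)
```

With $N_c = 20$ the two estimates differ by a factor of $20/19$, so the choice rarely matters unless a class has very few samples.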

Either way, usually you don't calculate the equation of the decision boundary in QDA. Given a test point you just evaluate the posterior probability of each class, and pick the highest.
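To make "evaluate the posterior and pick the highest" concrete, here is a minimal sketch assuming Gaussian class-conditional densities and priors estimated from class frequencies. The helper names (`fit_qda`, `log_joint`, `posterior`) and the toy data are hypothetical, chosen only for this example.

```python
import numpy as np

def fit_qda(X, y):
    """Estimate (prior, mean, covariance) for each class."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    params = {}
    for c in np.unique(y):
        X_c = X[y == c]
        params[c] = (len(X_c) / len(X), X_c.mean(axis=0), np.cov(X_c, rowvar=False))
    return params

def log_joint(x, prior, mean, cov):
    """log(prior) + log N(x | mean, cov)."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)  # squared Mahalanobis distance
    return np.log(prior) - 0.5 * (len(x) * np.log(2 * np.pi) + logdet + maha)

def posterior(x, params):
    """Posterior class probabilities via a numerically stable softmax."""
    classes = list(params)
    scores = np.array([log_joint(x, *params[c]) for c in classes])
    probs = np.exp(scores - scores.max())
    return classes, probs / probs.sum()

# Toy data (assumed): two classes with different spreads
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], [1.0, 1.0], size=(50, 2)),
               rng.normal([4, 4], [0.5, 2.0], size=(50, 2))])
y = np.repeat([0, 1], 50)

params = fit_qda(X, y)
classes, probs = posterior(np.array([4.0, 4.0]), params)
predicted = classes[int(np.argmax(probs))]
```

No boundary equation is needed here: the prediction is just the class whose posterior is largest at the test point.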

Andy Jones
2

To answer the second part of your question:

and how can I generate a quadratic boundary equation from this?

You must first understand the difference between LDA and QDA:

  • In Linear Discriminant Analysis (LDA) we assume that the observations within each class are drawn from a multivariate Gaussian distribution with a class-specific mean vector, but a covariance matrix that is common to all $K$ classes.

  • Quadratic Discriminant Analysis (QDA) provides an alternative approach by assuming that each class has its own covariance matrix $\Sigma_k$.

To generate the boundary equation you need the scoring (discriminant) function for QDA. Since you don't explicitly ask for the derivation, I will simply state it:

$$ \delta_k(x) = \log \pi_k - \frac{1}{2} \log |\Sigma_k| - \frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) $$

The boundary equation is the equation in $x$ that you obtain by equating the scoring functions of two different classes:

$$ \delta_k(x) = \delta_l(x)$$

Solving this by hand can be very painful. Note also that it is a quadratic function of $x$, hence the name quadratic discriminant analysis.
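For reference, expanding the quadratic form in $\delta_k(x) - \delta_l(x) = 0$ and collecting terms gives the boundary in the generic form below. The symbols $A$, $b$ and $c$ are introduced here only for this sketch:

$$ x^T A x + b^T x + c = 0, $$

with

$$ A = -\frac{1}{2} \left( \Sigma_k^{-1} - \Sigma_l^{-1} \right), \qquad b = \Sigma_k^{-1} \mu_k - \Sigma_l^{-1} \mu_l, $$

$$ c = \log \frac{\pi_k}{\pi_l} - \frac{1}{2} \log \frac{|\Sigma_k|}{|\Sigma_l|} - \frac{1}{2} \left( \mu_k^T \Sigma_k^{-1} \mu_k - \mu_l^T \Sigma_l^{-1} \mu_l \right). $$

If $\Sigma_k = \Sigma_l$, then $A = 0$ and the boundary reduces to the linear one of LDA.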

This is why @Andy advises against calculating the function explicitly: you are better off evaluating the score function of every class at each point on a grid and assigning each point to the class with the highest value. Alternatively, you can compute the posterior probabilities, which give the same result.
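The grid approach can be sketched in a few lines of NumPy. The priors, means and covariance matrices below are made-up values for a two-class, two-dimensional example; in practice they would be estimated from the training data.

```python
import numpy as np

def discriminant(points, prior, mean, cov):
    """QDA score delta_k(x) for every row of `points`."""
    diff = points - mean
    _, logdet = np.linalg.slogdet(cov)
    # Row-wise quadratic form (x - mu)^T Sigma^{-1} (x - mu)
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return np.log(prior) - 0.5 * logdet - 0.5 * maha

# Hypothetical class parameters (assumed for illustration)
priors = [0.5, 0.5]
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), np.array([[2.0, 0.5], [0.5, 0.5]])]

# Evaluate every class score on a regular grid
xs, ys = np.meshgrid(np.linspace(-4, 8, 200), np.linspace(-4, 8, 200))
grid = np.column_stack([xs.ravel(), ys.ravel()])
scores = np.column_stack([discriminant(grid, p, m, S)
                          for p, m, S in zip(priors, means, covs)])

# Assign each grid point to the class with the highest score
labels = scores.argmax(axis=1).reshape(xs.shape)
```

The decision boundary is where `labels` changes value. With a plotting library it could be drawn as the zero-level contour of the difference between the two score columns.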