Is it possible to use more than one categorical dependent variable with partial least squares discriminant analysis (PLS-DA)?

Thanks.

Patsy

1 Answer

Short answer: Yes.

Long answer: If there are more than 2 classes, the number of dependent variables should match the number of classes, and they should be one-hot encoded. Say you have 6 samples in total and 3 classes, C1, C2, and C3, with 2 samples each. The dependent variables should then look like this:

C1 C2 C3
-- -- --
1  0  0
1  0  0
0  1  0
0  1  0
0  0  1
0  0  1

The 1st and 2nd samples belong to C1, so they take 1 in the C1 column and 0 in the others. Similarly, the 3rd and 4th samples belong to C2 and take 1 in the C2 column and 0 elsewhere, and so on.
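To make the encoding concrete, here is a minimal sketch in plain Python that builds the same matrix from a list of class labels. The function name `one_hot` and the variable names are only illustrative; they do not come from any particular PLS-DA package.

```python
def one_hot(labels, classes):
    """Return one row per sample, with 1 in the column of its class and 0 elsewhere."""
    return [[1 if label == c else 0 for c in classes] for label in labels]


# Same setup as the table above: 6 samples, 3 classes, 2 samples per class.
classes = ["C1", "C2", "C3"]
labels = ["C1", "C1", "C2", "C2", "C3", "C3"]

Y = one_hot(labels, classes)
for row in Y:
    print(row)
```

Each row of `Y` is one sample and each column is one dependent variable, so `Y` is what you would pass as the response matrix.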

Note that there is one exception: when there are only 2 groups, a single column (in other words, a single dependent variable) is enough, since the other one is its complement.

C1 C2  =  C
-- --     --
1  0      1
1  0      1
1  0      1
0  1      0
0  1      0
0  1      0

These two encodings yield the same results in PLS-DA: each row sums to 1, so the second column is fully determined by the first and carries no extra information.
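The redundancy in the two-class case can be checked directly. This is only a small sketch in plain Python using the table above; the variable names are made up for illustration.

```python
# Two-column one-hot encoding from the table above (columns: C1, C2).
Y_two = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]

# Single-column encoding: keep only the C1 column.
c = [row[0] for row in Y_two]

# The dropped C2 column can be rebuilt as the complement of the kept one.
c2_rebuilt = [1 - value for value in c]

print(c)           # single dependent variable
print(c2_rebuilt)  # identical to the original C2 column
```

Since nothing is lost by dropping the second column, fitting on `c` alone is enough for two groups.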

As a final note, for prediction, the predicted class for a sample is usually the class (corresponding column) with the highest predicted value.
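As a rough sketch of that decoding step in plain Python: the `scores` below are made-up numbers standing in for the continuous outputs of a fitted PLS-DA model (they are not real results), and `predict_classes` is a hypothetical helper name.

```python
def predict_classes(scores, classes):
    """Pick, for each sample, the class whose column has the highest score."""
    predicted = []
    for row in scores:
        best_column = max(range(len(row)), key=lambda j: row[j])
        predicted.append(classes[best_column])
    return predicted


classes = ["C1", "C2", "C3"]

# Illustrative values only: one row per sample, one column per class.
scores = [
    [0.92, 0.10, -0.02],
    [0.15, 0.70, 0.15],
    [0.05, 0.30, 0.65],
]

predicted = predict_classes(scores, classes)
print(predicted)
```

Note that the scores are regression outputs, so they need not lie in [0, 1] or sum exactly to 1. In case of a tie, this sketch picks the first of the tied columns.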

gunakkoc