I want to create a single operational performance score for each day, so that we can tell whether performance was good or bad, compare days, and be proactive in maximizing performance. I ran a principal components analysis on the eight success variables that I have. Three components together describe 80% of the variation. I then used SAS to calculate a score on each principal component for each day. My question: can I add the three generated scores for each day into one performance score? If not, how would I get a single performance score? (My variables do not load well on just one principal component; the first component accounts for only 30% of the variability.) Thank you for your help.
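For concreteness, here is a minimal sketch of the setup described in the question. It is written in Python with NumPy rather than SAS and uses synthetic data (200 hypothetical days, eight hypothetical variables), so the numbers mean nothing in themselves. It computes unstandardized component scores from the correlation matrix, then compares the first component with the sum of the first three. The sum is just another weighted combination of the standardized variables, and once its weight vector is rescaled to unit length it cannot have more variance than the first component.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the real data: 200 days x 8 performance variables.
X = rng.normal(size=(200, 8))

# Standardize each variable (PCA on the correlation matrix).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Eigen-decomposition of the correlation matrix, sorted by decreasing eigenvalue.
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()   # share of total variance per component
scores = Z @ eigvecs                  # one column of scores per component

pc1_score = scores[:, 0]
summed_score = scores[:, :3].sum(axis=1)

# The components are uncorrelated, so the variance of the sum is the sum
# of the first three eigenvalues.
var_sum = summed_score.var(ddof=1)

# The sum equals Z @ w with w = v1 + v2 + v3, which has length sqrt(3).
# Rescaled to unit length, the composite's variance is the average of the
# three eigenvalues, which cannot exceed the first eigenvalue.
w = eigvecs[:, :3].sum(axis=1)
unit_composite = Z @ (w / np.linalg.norm(w))
var_unit = unit_composite.var(ddof=1)

print("explained by PC1..PC3:", explained[:3].round(3))
print("var(PC1) =", round(eigvals[0], 3), " var(unit composite) =", round(var_unit, 3))
```

Whether an unweighted sum is still worth using depends on whether each component has a meaningful performance interpretation, which is a judgment about the data and not something the arithmetic settles.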

  • Not a good idea. The main point of PCA here is essentially that PC1 is the best single summary. You won't improve on it by mixing it with other components. – Nick Cox Mar 10 '16 at 18:04
  • Thank you, Nick. Why do you think adding them up would not improve on it? And how would you recommend I combine them into one score? Thanks again. – Tarek Soukieh Mar 10 '16 at 18:10
  • @TarekSoukieh Check the definition of PCA, and you will see why. If you're still having trouble, look at the geometric visualization explanation of what PCA is. – rocinante Mar 10 '16 at 19:11
  • Then what do you recommend as a solution? Thanks. – Tarek Soukieh Mar 10 '16 at 21:11
  • @Nick Although that was my first thought, too, upon further reflection it seemed to me there could be merit in this approach. If the three components are interpretable as meaningful performance-related factors, and if the performance is considered to be some function of those three factors, then it might make a great deal of sense to do something like add the three factors. This shows the importance of understanding which aspects of this question are mathematical, which are statistical, and which involve some form of valuation. – whuber Mar 10 '16 at 22:17
  • @whuber Then you would need to be more than usually careful about the arbitrary signs of the results, would you not? – Nick Cox Mar 11 '16 at 07:37
  • @Nick Yes, that's right. But presumably that would be part of what it means for a component to be "interpretable." – whuber Mar 11 '16 at 14:04
  • @NickCox: What do you mean by "careful about the arbitrary signs of the results"? The variable "Abandonment Rate" has a negative loading on the principal component, and I think that SAS takes that into account when calculating the score for each component. Is there anything else I need to do? @whuber: I am planning to use this performance score as a dependent variable in my forecasting procedure. Would that be acceptable as well? Thank you both very much. – Tarek Soukieh Mar 11 '16 at 18:40
  • You got one set of results from SAS on one occasion, but reversals of sign are perfectly expectable in general. Much asked here e.g. http://stats.stackexchange.com/questions/88880/does-the-sign-of-pca-or-fa-components-have-a-meaning – Nick Cox Mar 11 '16 at 18:49
  • The question in the link says that we should not be too worried about the sign of each variable's loading on each component, while you are saying I should be worried about the signs. Can you please explain? @whuber Do you think I can use this principal component as a dependent variable in my forecasting model? – Tarek Soukieh Mar 11 '16 at 19:16
  • If you don't take care of the sign, you could end up subtracting where you intended to add, or vice versa. Although the sign is irrelevant for determining a linear subspace, it is essential for interpretation! – whuber Mar 11 '16 at 19:58
  • Thank you @whuber. That is my understanding. If a variable loads negatively on a principal component, then it has negative influence on the final score of that component for each record. – Tarek Soukieh Mar 11 '16 at 21:15
  • Even if you intend only to do this yourself in-house with some particular software, the fact that results could differ with other software should concern anyone else interested in the method, and it should concern you if you intend to publish the method or its results. – Nick Cox Mar 12 '16 at 10:12
  • Someone suggested that I use rank order to create a performance score. Basically, for each variable I create an LSM coefficient, rank the days on that coefficient, and give each day a rank number. I do that for each of the variables, then for each day add up its rank numbers across all variables to get a performance score. Is this a better approach than PCA? Why? @whuber Thank you all for your help. – Tarek Soukieh Apr 01 '16 at 17:08
  • What's an "LSM coefficient"? – whuber Apr 01 '16 at 17:19
  • Sorry, I meant least squares means estimate. – Tarek Soukieh Apr 01 '16 at 17:23
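
The sign issue raised in the comments, and the rank-sum suggestion, can both be illustrated with a short sketch. It is Python with NumPy on synthetic data, not the SAS workflow from the question. The `orient` helper, the choice of reference variable, and the `direction` vector are all hypothetical conventions introduced for illustration. The rank-sum part ranks the raw daily values directly and does not reproduce the least squares means step.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-in: 50 days x 4 variables.
X = rng.normal(size=(50, 4))
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
V = eigvecs[:, np.argsort(eigvals)[::-1]][:, :3]   # first three components

# Flipping the sign of a component gives an equally valid PCA solution;
# different software (or runs) may return either one.
V_flipped = V * np.array([1.0, -1.0, 1.0])

sum_a = (Z @ V).sum(axis=1)
sum_b = (Z @ V_flipped).sum(axis=1)   # differs from sum_a by 2 * (PC2 score)

def orient(loadings, reference_var=0):
    """Flip each component so the reference variable loads non-negatively.

    Hypothetical convention: in practice each component would be oriented
    so that a higher score means better performance.
    """
    signs = np.where(loadings[reference_var, :] < 0, -1.0, 1.0)
    return loadings * signs

# After fixing an orientation, both solutions give the same summed score.
fixed_a = (Z @ orient(V)).sum(axis=1)
fixed_b = (Z @ orient(V_flipped)).sum(axis=1)

# Rank-sum alternative: rank the days on each variable and add the ranks.
# direction is +1 where higher is better and -1 where lower is better
# (for example, an abandonment rate); the values here are made up.
direction = np.array([1, 1, -1, 1])
ranks = (X * direction).argsort(axis=0).argsort(axis=0) + 1   # 1 = worst day
rank_score = ranks.sum(axis=1)

print("summed scores agree before orienting:", np.allclose(sum_a, sum_b))
print("summed scores agree after orienting: ", np.allclose(fixed_a, fixed_b))
```

Note that the rank-sum score also needs a direction chosen for every variable, so it does not avoid the valuation question; it only makes the choice explicit.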
