
My data are percentages that sum to 100% ± 5% per sample. I have read that data summing to 100% (compositional data) should be transformed before doing PCA, and one method I have seen others use is the centered log-ratio (clr) transformation.

I'm not sure whether I should apply it to my data set: the sum is not exactly fixed (some samples sum to 105%), but an increase in one value still causes a decrease in another, so the sum stays near 100%.
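For concreteness, here is a minimal sketch of what I understand by "clr, then PCA", written with NumPy only. The numbers are made up for illustration (they are not my real data), and the helper names `clr` and `pca_scores` are my own, not from any library. It assumes every part is strictly positive, since the log of zero is undefined.

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform of each row (parts must be strictly positive)."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("clr needs strictly positive parts; handle zeros first")
    log_x = np.log(x)
    # Subtract each row's mean log, i.e. the log of its geometric mean.
    return log_x - log_x.mean(axis=1, keepdims=True)

def pca_scores(z, n_components=2):
    """PCA scores from the SVD of the column-centered matrix."""
    zc = z - z.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(zc, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Hypothetical percentages; the row sums are 100, 105, 96 and 102.
raw = np.array([
    [60.0, 30.0, 10.0],
    [52.5, 42.0, 10.5],
    [48.0, 24.0, 24.0],
    [51.0, 30.6, 20.4],
])

# The same samples rescaled ("closed") so that every row sums to exactly 100.
closed = 100.0 * raw / raw.sum(axis=1, keepdims=True)

clr_raw = clr(raw)
clr_closed = clr(closed)
scores = pca_scores(clr_raw)

# Multiplying a row by a constant shifts every log and the row's mean log
# by the same amount, so the two clr results agree.
print(np.allclose(clr_raw, clr_closed))
```

One thing this makes visible: clr is unchanged when a row is multiplied by a constant, so a sample summing to 105% gets the same clr values as the same sample rescaled to 100%. That does not tell me whether the transformation is appropriate for my data, only that the deviation of the totals from 100% is discarded by it.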

Let me know if anyone has any feedback.

Thanks!

Stefg7
  • I'll let others comment, but this seems to me to be a nonstandard case such that trying it both ways and seeing how it influences the results is the only way to proceed. – John Madden Jan 12 '24 at 17:39
  • Please explain how this constraint arises. It could make a difference whether there's an underlying sum-to-unity constraint (and this reflects measurement errors) or if there's no actual constraint. Whether "data ... should be altered" is a matter for investigation, not a general rule to be followed blindly. There are some general principles you might find helpful, some of which are followed at https://stats.stackexchange.com/a/259223/919 and are likely applicable to your analyses. – whuber Jan 12 '24 at 17:39

0 Answers