
I am using XLStat for a PCA of time-series water chemistry data. I have 23 analytes and 29 samples. I am using a correlation matrix for PCA as I find it more interpretable in the context of hydrochemistry. The data is also standardized to a variance of 1 and a mean of 0 to avoid the effect of differing units.

The results of the PCA look great: they are easy to interpret and everything makes a lot of sense. There are numerous significant correlations in the correlation matrix (alpha=0.5). A KMO sampling adequacy test yields a value of 0.64. The problem is that I keep getting an observed chi-squared of "-Inf" for Bartlett's Sphericity Test. Essentially, this means that the chi-squared could not be computed.

  1. What is going on here? This value makes no sense given the strong correlations in the matrix.

  2. Can I continue with PCA despite the failed test?

  3. Could the problem be that by normalizing the data I am imposing normality upon it falsely?

Data:

http://www.filedropper.com/wcrb_1

Matt
  • KMO isn't needed for PCA, actually, it is for factor analysis (see and a link therein). Bartlett's test - hard to say what was wrong without having data (you could show your data, btw). This test is for large sample from normal population (e.g. see). This test is mainly for factor analysis. What might be a reason to use it in the context of PCA as long as PCA is seen as just a data reduction transformation? – ttnphns Sep 26 '14 at 15:10
  • Thanks for the response. I didn't realize KMO was more aimed at factor analysis. The real problem is the failure of Bartlett's Sphericity Test. How do I include my data? – Matt Sep 29 '14 at 14:58
  • If you want to give data, you could publish it in the question body (formatted as code) or leave a link there to an external file hosting site. – ttnphns Sep 29 '14 at 18:01
  • I have left a link to a shared file. The data shown is the transformed data and the results of PCA. Thanks for your help! – Matt Sep 29 '14 at 18:19
  • Thanks for sharing it. Exemplarily done work! I ran PCA in SPSS and confirm every figure except Bartlett's (and contributions / cosines which I didn't check. BTW, how did you compute them?) Now, "my" Bartlett's was: Approx. Chi-square 997.054; df 253; Sig. .00000. SPSS computes the test as written here. Could it be that your program simply considered the determinant of the matrix so close to 0 that it skipped computing the chi-sq value? – ttnphns Sep 29 '14 at 19:00
  • Wow. Thanks for checking this out and the praise. I appreciate it a lot. It is interesting that SPSS had no problem computing a chi-squared. I wonder if I have uncovered a bug in XLStat? It is nice to have external confirmation of results. I have to claim ignorance in the computation of the contributions and cosines. I checked the box in XLStat and that is what I got. I suppose that is the risk of powerful stats programs in the hands of inexperienced users: too much information and not enough knowledge to handle it properly. – Matt Sep 30 '14 at 21:24
  • It may be a bug in XLStat, or it may be intended behaviour. As I said, your correlation matrix is virtually singular, and the program might be designed to skip such cases. XLStat may also be computing the chi-sq value in a slightly different way than SPSS does. – ttnphns Oct 01 '14 at 03:19
  • @ttnphns: do you want to post your comment(s) as an answer? Better to have a short answer than no answer at all. Anyone who has a better answer can post it. – Stephan Kolassa Jul 24 '22 at 06:11

1 Answer

Turning my 1st comment into an answer, per advice by @StephanKolassa...

KMO isn't needed for PCA, actually, it is for factor analysis (see and a link therein). Bartlett's test - hard to say what was wrong without having data (you could show your data, btw). This test is for large sample from normal population (e.g. see). This test is mainly for factor analysis. What might be a reason to use it in the context of PCA as long as PCA is seen as just a data reduction transformation?
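
To illustrate the near-singularity issue raised in the comments, here is a minimal sketch of Bartlett's sphericity statistic in its usual textbook form, chi-sq = -((n - 1) - (2p + 5)/6) * ln|R| with df = p(p - 1)/2. Whether XLStat or SPSS implement exactly this form is an assumption on my part; the function name `bartlett_sphericity` and the random demo data are purely illustrative and are not the asker's data. The point is that taking the logarithm of a determinant that has underflowed to 0 gives an infinite result, whereas `numpy.linalg.slogdet` returns ln|R| directly without forming |R|.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Textbook Bartlett sphericity statistic for a p x p correlation
    matrix R estimated from n observations (illustrative helper)."""
    p = R.shape[0]
    # slogdet returns (sign, ln|det|) without computing the determinant
    # itself, so a tiny but nonzero |R| does not underflow to 0.
    sign, logdet = np.linalg.slogdet(R)
    if sign <= 0:
        raise ValueError("R is singular or not positive definite")
    chi2 = -((n - 1) - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df

# Demo on random data with the same shape as in the question
# (29 samples, 23 variables); NOT the asker's data.
rng = np.random.default_rng(0)
X = rng.standard_normal((29, 23))
R = np.corrcoef(X, rowvar=False)
chi2, df = bartlett_sphericity(R, n=29)
print(f"chi-sq = {chi2:.3f}, df = {df}")
```

With p = 23 the degrees of freedom come out as 253, which matches the SPSS output quoted in the comments. If the same formula is assumed, a chi-square of 997.054 with n = 29 would correspond to ln|R| of roughly -51, i.e. a determinant small enough that a program applying a singularity threshold could plausibly refuse to compute the statistic.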

ttnphns