This is my first post, so apologies for any incorrect formatting, or if this has been answered elsewhere, but I seem to be going around in circles.
Basically, I have 12 survey plots and have recorded 17 variables for each: number of tree species, number of dead trees, % grass cover, etc. I have 2 years' worth of this data (2016, 2017), so I would be running each year separately. The plan is to use PCA in R to reduce the number of variables by using the component scores from the top principal components instead. I ran PCA using prcomp as follows:
dframe1 <- read.csv('g:/veg.csv', header=TRUE)
PCA.results <- prcomp(dframe1, center = TRUE, scale. = TRUE)
The first 5 PCs have eigenvalues > 1, so I obtained the component scores for these (abbreviated results shown):
PCA.results$x
PC1 PC2 PC3 PC4 PC5
[1,] -2.7329607 -0.3238917 -1.2887333 0.15997834 1.0115736
[2,] -0.4176688 -2.6465327 -2.4567818 1.17885072 0.130746
[3,] -0.2304915 -1.8657283 -0.4056321 -0.12534494 -1.6435601
[4,] -4.2221891 1.860162 0.5397799 -0.19361945 -1.2656926
[5,] -3.0834 1.7658483 -0.1064903 -1.02139467 0.9627706
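For reference, prcomp does not report eigenvalues directly; they are the squares of the component standard deviations in `$sdev`. A minimal sketch of the "eigenvalue > 1" check, assuming the built-in iris measurements as stand-in data (the object names `pca`, `eigenvalues` and `keep` are illustrative, not from the original analysis):

```r
# Stand-in data (assumption): the four numeric columns of the built-in iris set.
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Eigenvalues of the correlation matrix are the squared component sdevs.
eigenvalues <- pca$sdev^2

# Indices of components passing the "eigenvalue > 1" rule of thumb.
keep <- which(eigenvalues > 1)

print(eigenvalues)
print(keep)
```

With the real data, `PCA.results$sdev^2` would play the role of `eigenvalues`.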
I have read about using rotations such as varimax or oblimin to better separate the components. To check which one to use, I ran an oblimin rotation to see whether any of the correlations between factors are above 0.32 in the factor correlation matrix.
library(psych)
library(GPArotation)
my.oblimin <- fa(PCA.results$rotation, nfactors=5, rotate = "oblimin")
None of them were, so it looks as if an orthogonal (varimax) rotation should be okay to use. However, for one of the two years (the other ran without any errors) I got a message that said: "The estimated weights for the factor scores are probably incorrect. Try a different factor extraction method."
If I do decide to run a varimax rotation on my PCA results, how do I then get the new component scores? It doesn't seem possible to specify a rotation type in prcomp, so the rotation has to be run afterwards. This only gives the rotated loadings, i.e. variables against PCs, but no component scores, whether I use varimax or fa to perform the rotation:
r.varimax <- varimax(PCA.results$rotation[,1:5])
or
fa.varimax <- fa(PCA.results$rotation, nfactors=5, rotate = "varimax")
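One possible way to get rotated component scores in base R is sketched below. It is an illustration under stated assumptions, not a confirmed recipe for this data: iris stands in for the survey data, `ncomp` is set to 2 rather than 5, and all object names are hypothetical. Note that the sketch rotates loadings (eigenvectors scaled by the component standard deviations) rather than the raw eigenvectors in `$rotation`, and then projects the standardised data through the pseudo-inverse of the rotated loadings.

```r
# Stand-in data (assumption): replace with the 17 survey variables.
X <- iris[, 1:4]
ncomp <- 2  # number of components to keep and rotate (assumption)

pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Loadings = eigenvectors scaled by the component standard deviations.
raw_loadings <- pca$rotation[, 1:ncomp] %*% diag(pca$sdev[1:ncomp], ncomp, ncomp)

# Varimax rotation from the stats package (Kaiser normalisation by default).
rot <- varimax(raw_loadings)
rotated_loadings <- unclass(rot$loadings)

# Transposed pseudo-inverse of the rotated loadings L: L %*% solve(t(L) %*% L),
# valid when L has full column rank.
inv_loadings <- rotated_loadings %*% solve(crossprod(rotated_loadings))

# Rotated scores: standardised data projected through the pseudo-inverse.
rotated_scores <- scale(X) %*% inv_loadings
colnames(rotated_scores) <- paste0("RC", seq_len(ncomp))

print(head(rotated_scores))
```

Under these assumptions the result is the same as multiplying the standardised unrotated scores by the rotation matrix, i.e. `scale(pca$x[, 1:ncomp]) %*% rot$rotmat`, so the rotated scores come out standardised and uncorrelated.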
So, my questions are:
1. Does any/none/all of this seem reasonable?
2. Is it possible to obtain new component scores for my 12 sites using varimax-rotated PC values, and would you want to?
3. What does the error message about estimated weights refer to?
If the answer to 1 or 2 is No, and I shouldn't be running varimax after PCA if I'm interested in component scores, then question 3 is moot really.
Sorry if any of this isn't clear or I'm totally off target. Any help with this would be appreciated. Thanks, Rich
Sorry if this is obvious, my R is pretty basic.
– Rich_b Oct 10 '17 at 12:00
psych::principal does not work correctly; I'd say you don't need to use it for something so basic. Just use method #3. Regarding mydataX <- mydata[1:nrow(mydata), 1:17] - if there are 17 columns in your data then this line does not do anything at all, you don't need it and can use mydata directly. In iris there is a 5th column with species name that had to be kicked out. – amoeba Oct 10 '17 at 12:21