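I'm using R's built-in iris data to illustrate my point:
library(factoextra)  # provides fviz_pca_biplot()
library(ggplot2)     # provides theme_*(), labs(), ggsave()
pca = prcomp(iris[,-5], scale. = FALSE)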
png('irisNS_biplot.png',600,600)
biplot(pca)
dev.off()
fviz_pca_biplot(pca, habillage=as.factor(iris$Species), addEllipses=TRUE, ellipse.level=0.95,
label = "var", col.var = "red", col.ind = "#696969", alpha.var ="cos2", repel = TRUE) +
theme_minimal() + theme_gray(base_size =12) + labs(title="", x ="PCA1", y = "PCA2")
ggsave("irisNS_ggplot.png", width = 8, height = 6, dpi = 300)
Notice how the angle between “Petal Length” and “Sepal Length” differs between the two graphs. In the first graph (from biplot), the “Sepal Length” arrow points toward flower #119, while in the second graph (from fviz_pca_biplot), it points toward flower #132. Isn't one of them simply wrong? How can such different angles mean the same thing?
EDIT
Scaling the inputs:
pca = prcomp(iris[,-5], scale. = TRUE)
png('iris_biplot.png',600,600)
biplot(pca)
dev.off()
fviz_pca_biplot(pca, habillage=as.factor(iris$Species), addEllipses=TRUE, ellipse.level=0.95,
label = "var", col.var = "red", col.ind = "#696969", alpha.var ="cos2", repel = TRUE) +
theme_minimal() + theme_gray(base_size =12) + labs(title="", x ="PCA1", y = "PCA2")
ggsave("iris_ggplot.png", width = 8, height = 6, dpi = 300)
Now the difference between the angles is smaller, but in the biplot the “Sepal Length” arrow is between 123 and 106/136, while in the ggplot it is between 103/136 and 110.
I also noticed that biplot shows a different scale for the arrows on the top and right axes, though both seem to use an aspect ratio of 1. The ggplot had a different aspect ratio in the first graph, but one close (or equal?) to 1 in the second. However, its axes for the arrows are not shown, so their aspect ratio is hidden. Does that make this specific ggplot graphic less reliable?
EDIT2
Added coord_fixed() to ggplot, to make sure its aspect ratio is 1.
fviz_pca_biplot(pca, habillage=as.factor(iris$Species), addEllipses=TRUE, ellipse.level=0.95,
label = "var", col.var = "red", col.ind = "#696969", alpha.var ="cos2", repel = TRUE) +
theme_minimal() + theme_gray(base_size =12) + labs(title="", x ="PCA1", y = "PCA2") +
coord_fixed()
ggsave("iris_ggplotASP1.png", width = 8, height = 6, dpi = 300)
Still, the arrows don't fall in the same positions relative to the points as they do with the biplot function. I see that the numbers on the left/bottom axes differ between the two functions, but scaling the values equally along both axes shouldn't change the angle of the arrows relative to the angle of the points (measured from the origin (0,0) of the graph). So I guess the arrows use a different coordinate system, one that does not respect the aspect ratio of 1, as (I thought) it should. Or am I missing something?
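A minimal sketch for testing this guess with base R only. It reproduces the default scaling of biplot for prcomp objects (scores divided by lam, loadings multiplied by lam, where lam = sdev * sqrt(n)) and compares the result with the raw scores. That fviz_pca_biplot places the points at their raw scores is an assumption here, suggested by its axis ranges and not checked against the factoextra source. The helper angle_deg is made up for illustration.

```r
# Base R only. Mirrors the default scaling (scale = 1) of stats:::biplot.prcomp.
pca <- prcomp(iris[, -5], scale. = TRUE)
n   <- nrow(pca$x)
lam <- pca$sdev[1:2] * sqrt(n)            # per-axis scaling factor

# Coordinates as biplot() draws them: scores / lam, loadings * lam.
obs_biplot <- sweep(pca$x[, 1:2], 2, lam, "/")
var_biplot <- sweep(pca$rotation[, 1:2], 2, lam, "*")

# ASSUMPTION: the ggplot version places the points at their raw scores.
obs_raw <- pca$x[, 1:2]

# Hypothetical helper: direction of a 2-D vector from the origin, in degrees.
angle_deg <- function(v) unname(atan2(v[2], v[1]) * 180 / pi)

angle_deg(var_biplot["Sepal.Length", ])   # arrow direction in biplot()
angle_deg(obs_biplot[119, ])              # flower 119 as drawn by biplot()
angle_deg(obs_raw[119, ])                 # flower 119 at its raw scores
```

Because lam[1] and lam[2] differ, dividing the scores by lam rescales the two axes by different amounts, so a point's direction from the origin changes even when both plots are drawn with an aspect ratio of 1. If the assumption about the ggplot version holds, that alone would shift the arrows relative to the points.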







scale = TRUE in prcomp? – dipetkov Jun 29 '22 at 22:20
?biplot produces the advice, "There are many variations on biplots (see the references)..." which at least heads you in a productive direction. Indeed, one of the references is an entire book on this plot (!), "J.C. Gower and D. J. Hand (1996). Biplots. Chapman & Hall." A search for it turns up another more recent book, https://onlinelibrary.wiley.com/doi/book/10.1002/9780470973196. – whuber Jun 30 '22 at 16:48