I am using the R programming language. I am trying to figure out how to "recreate" plots in ggplot2/plotly, once they have been created in base R.
For example, I created some data and made a plot:
library(Rtsne)
library(cluster)
library(ggplot2)
library(dplyr)
library(dbscan)
library(plotly)
#generate data
var_1 <- rnorm(100,1,4)
var_2<-rnorm(100,10,5)
var_3 <- sample( LETTERS[1:4], 100, replace=TRUE, prob=c(0.1, 0.2, 0.65, 0.05) )
response_variable <- sample( LETTERS[1:2], 100, replace=TRUE, prob=c(0.4, 0.6) )
#put them into a data frame called "f"
f <- data.frame(var_1, var_2, var_3, response_variable)
#declare var_3 and response_variable as factors
f$response_variable = as.factor(f$response_variable)
f$var_3 = as.factor(f$var_3)
#create id
f$ID <- seq_along(f[,1])
#gower distance
gower_dist <- daisy(f[, -c(4,5)],
                    metric = "gower")
#lof plot
lof <- lof(gower_dist, k=3)
plot(gower_dist, pch = ".", main = "LOF (k=3)")
points(gower_dist, cex = (lof-1)*3, pch = 1, col="red")
text(gower_dist[lof>2,], labels = round(lof, 1)[lof>2], pos = 3)
Here is a picture of the plot:
Now, I am trying to recreate a similar plot in ggplot2 and plotly. I ran the t-SNE algorithm on the same distance matrix:
# tsne
tsne_obj <- Rtsne(gower_dist, is_distance = TRUE)
tsne_data <- tsne_obj$Y %>%
  data.frame() %>%
  setNames(c("X", "Y")) %>%
  mutate(name = f$ID)
I want the axes of the new graph to be tsne_data$X and tsne_data$Y.
I am not sure whether it is still possible to recreate the previous plot in ggplot2 with the red circles (the radius of each red circle is proportional to its "lof" score). Is it possible to make a plot in ggplot2 similar to the one I made before? Is it possible to make the size of the points in ggplot2 proportional to the "lof" values?
I tried the following:
plot = ggplot(aes(x = X, y = Y), data = tsne_data) + geom_point(aes())
and got something like this:
Is it possible to change the size of these points based on the values of "lof"? For reference, the distribution of the scores can be inspected with:
summary(lof)
hist(lof, breaks=10)
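To make the size question concrete, here is a minimal sketch of one possible approach: store the scores as a column and map that column to the `size` aesthetic. The stand-in data, the `lof_plot` name, the `scale_size_continuous(range = ...)` values, and the `lof > 2` label threshold (borrowed from the base R code) are assumptions for illustration. The circle sizes will not match the base R `cex = (lof-1)*3` scaling exactly.

```r
library(ggplot2)

# Stand-in data so the sketch runs on its own (assumption); in the real
# script, use the tsne_data and lof objects created above instead.
set.seed(1)
tsne_data <- data.frame(X = rnorm(100), Y = rnorm(100), name = 1:100)
tsne_data$lof <- runif(100, 0.9, 3)  # hypothetical scores; really: tsne_data$lof <- lof

lof_plot <- ggplot(tsne_data, aes(x = X, y = Y)) +
  # small black dot for every observation
  geom_point(size = 0.5) +
  # open red circle whose size is mapped to the lof score
  geom_point(aes(size = lof), shape = 1, colour = "red") +
  # label only the points with a score above 2
  geom_text(data = subset(tsne_data, lof > 2),
            aes(label = round(lof, 1)), vjust = -1) +
  # range controls the smallest and largest circle; the values are arbitrary
  scale_size_continuous(range = c(1, 10)) +
  ggtitle("LOF (k=3)")
```

Printing `lof_plot` draws the figure. Widening or narrowing `range` changes how strongly the circle size reflects the score.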
Is it also possible, when hovering the mouse over a plotly rendition of the ggplot2 plot, to display the lof score and f$ID?
plotly_plot = ggplotly(plot)
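For the hover question, a rough sketch of one common pattern is to build the hover label in a `text` aesthetic and pass `tooltip = "text"` to `ggplotly()`. The stand-in data, the `hover_plot` name, and the label wording are assumptions for illustration; `name` is the column holding f$ID in tsne_data.

```r
library(ggplot2)
library(plotly)

# Stand-in data so the sketch runs on its own (assumption); in the real
# script, use tsne_data with the real lof scores attached as a column.
set.seed(1)
tsne_data <- data.frame(X = rnorm(100), Y = rnorm(100), name = 1:100)
tsne_data$lof <- runif(100, 0.9, 3)  # hypothetical scores

hover_plot <- ggplot(
  tsne_data,
  aes(x = X, y = Y, size = lof,
      # text is not drawn by ggplot2; ggplotly uses it for the tooltip
      text = paste0("ID: ", name, "<br>LOF: ", round(lof, 2)))
) +
  geom_point(shape = 1, colour = "red")

# tooltip = "text" limits the hover box to the label built above
plotly_plot <- ggplotly(hover_plot, tooltip = "text")
```

Printing `plotly_plot` opens the interactive version. Leaving out the `tooltip` argument would show every mapped aesthetic in the hover box instead of only the custom label.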
Thanks!
Sources: https://www.rdocumentation.org/packages/dbscan/versions/1.1-5/topics/lof https://dpmartin42.github.io/posts/r/cluster-mixed-types