I pasted the data here: The dataframe contains multiple observations on x and y per country. Each country is also part of a region.
Based on this post, I managed to draw polygons (clusters) in the scatterplot with ggplot, grouped by the same factor that determines the colors of my points (i.e., country). Here's the code I used:
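library(plyr)     # provides ddply()
library(ggplot2)
# Keep only the rows that lie on the convex hull of (x, y)
find_hull <- function(df) df[chull(df$x, df$y), ]
# Compute one hull per country
hulls <- ddply(df, "country_name", find_hull)
plot <- ggplot(data = df, aes(x = x, y = y, colour = country_name, fill = country_name)) +
  geom_point() +
  geom_polygon(data = hulls, alpha = 0.5)
plot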
But what if I want to draw the polygons based on region, and still have the color assigned by country? Just changing country_name to region when ddplying the find_hull function did not produce satisfying results.
I have the feeling it's because I do not fully understand yet what the chull function does, but I didn't manage to wrap my head around it.
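To make the question concrete, here is a minimal sketch of what chull seems to do, using only base R. The toy data frame and the name hulls_region are made up for illustration and are not my real data. As far as I can tell, chull returns the row indices of the points on the convex hull, and the split/lapply line should be roughly what ddply(df, "region", find_hull) does:

```r
# Toy data (made up for illustration): four corners of a square plus one
# interior point, two countries inside a single region.
toy <- data.frame(
  x = c(0, 1, 1, 0, 0.5),
  y = c(0, 0, 1, 1, 0.5),
  country_name = c("A", "A", "B", "B", "B"),
  region = "R1"
)

# chull() returns the row indices of the points lying on the convex hull;
# the interior point (row 5) is not among them.
idx <- chull(toy$x, toy$y)

# Same helper as above: keep only the hull rows of a data frame.
find_hull <- function(df) df[chull(df$x, df$y), ]

# Base-R stand-in for ddply(toy, "region", find_hull): one hull per region.
hulls_region <- do.call(rbind, lapply(split(toy, toy$region), find_hull))

# The hull of a single region contains rows from more than one country.
print(hulls_region)
```

If this is right, the hull rows of one region carry several country_name values. A geom_polygon layer that inherits colour and fill from country_name would then probably be split into one polygon per country instead of one per region, so giving that layer its own mapping, for example aes(group = region, fill = region), may be the missing piece.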