I'm using the Gini coefficient and Lorenz curve plots in R to show how beneficiaries accumulate at ecosystem services (ES) supply points. I classify ES into three categories and calculate the Gini coefficient and the Lorenz curve for each category. However, the category with the highest Gini value doesn't correspond to the largest area in the plot. Why is this happening?
Here you can download the dataframe
And here is the script:
library(ggplot2)
library(gglorenz)
library(ineq)
occ_hubs <- read.csv2('occ_hubs.csv')
# Calculate the Gini coefficient per category
Category <- 1
Gini_value <- 1
j <- 1
for (i in unique(occ_hubs$categoria)) {
  subset <- subset(occ_hubs, categoria == i)
  gini_cat <- Gini(subset$id_hub, subset$Freq)
  Category[j] <- i
  Gini_value[j] <- gini_cat
  j <- j + 1
  print(c(i, gini_cat))
}
# Collect the results in one data frame
ginis_cat <- as.data.frame(Category)
ginis_cat$Gini <- Gini_value
# Plot the Lorenz curve by category
ggplot(occ_hubs, aes(x = Freq, color = categoria)) +
  stat_lorenz(desc = TRUE, size = 1) +
  coord_fixed() +
  geom_abline(linetype = "dashed") +
  theme_minimal() +
  hrbrthemes::scale_x_percent() +
  hrbrthemes::scale_y_percent() +
  hrbrthemes::theme_ipsum() +
  labs(x = "Cumulative percentage of hubs",
       y = "Cumulative percentage of total flow") +
  theme(legend.title = element_blank())
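To make the expectation concrete, here is a minimal sketch on made-up toy data (not the dataset above). `lorenz_gini` is a hypothetical helper written for illustration, not a function from ineq or gglorenz. It computes the Gini coefficient as one minus twice the trapezoidal area under the Lorenz curve, so the category whose curve lies farthest from the diagonal should get the largest value. The sketch sorts in ascending order, whereas the plot uses desc = TRUE; that mirrors the curve above the diagonal without changing the enclosed area.

```r
# Hypothetical helper: Gini coefficient derived from the Lorenz curve of x
lorenz_gini <- function(x) {
  x <- sort(x[!is.na(x)])            # ascending order, NAs dropped
  n <- length(x)
  p <- c(0, seq_len(n) / n)          # cumulative share of hubs
  L <- c(0, cumsum(x) / sum(x))      # cumulative share of total flow
  # Trapezoidal area under the Lorenz curve
  area_under <- sum(diff(p) * (head(L, -1) + tail(L, -1)) / 2)
  1 - 2 * area_under                 # twice the area between diagonal and curve
}

# Toy data, invented for illustration only
toy <- data.frame(
  categoria = rep(c("A", "B"), each = 4),
  Freq      = c(1, 1, 1, 1,          # A: perfectly even flow
                1, 1, 1, 97)         # B: one hub receives almost everything
)

# One Gini value per category, computed on the flow column
gini_by_cat <- tapply(toy$Freq, toy$categoria, lorenz_gini)
print(gini_by_cat)                   # A: 0, B: 0.72
```

In this toy case the ordering of the values matches the ordering of the areas, which is the behaviour I expected from my own data.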
Thank you.