I am running a simple clustering analysis, and when I plot the result the legend does not work as expected. It shows only two values (0 and 1). I was expecting to see the clusters 0-4, each with the color it has in the graph. Here is the code I have so far:

%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
import numpy as np
import pandas as pd
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

#Creating DataFrame for plotting
X = pd.DataFrame({"Times_Correct": tmpX['Times_Correct'], "Times_Incorrect": tmpX['Times_Incorrect']})

#Plotting fans correct versus incorrect predictions
plt.scatter(X['Times_Correct'], X['Times_Incorrect'], s=50);

#Clustering
kmeans = KMeans(n_clusters=5)
kmeans.fit(X)
X_kmeans = kmeans.predict(X)

#Plotting clusters
plt.scatter(tmpX['Times_Correct'], tmpX['Times_Incorrect'], c=X_kmeans,  s=50, cmap='viridis')
plt.xlabel("Times Correct", size=12)
plt.ylabel("Times Incorrect", size=12)

#add legend to the plot
plt.legend(np.unique(X_kmeans))

Current output: [image not included; the legend lists only 0 and 1]

Is there any way to show the unique values in X_kmeans with their associated colors, or do I need to rework the logic here?

Thank you in advance for any help

  • See also https://stackoverflow.com/questions/39091515/matplotlib-does-not-show-legend-in-scatter-plot – JohanC Jun 21 '21 at 19:58
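A possible approach, offered as a minimal sketch rather than a definitive fix: `plt.legend(np.unique(X_kmeans))` treats the array as a list of labels for the artists already on the axes, and since the code calls `plt.scatter` twice there are only two artists, which would explain why only 0 and 1 appear. Assuming matplotlib 3.1 or later, the collection returned by `scatter` has a `legend_elements()` method that builds one handle per unique value passed to `c`. The data below is a synthetic stand-in generated with `make_blobs`, because `tmpX` is not shown in the question; `n_init` and `random_state` are likewise assumptions added for reproducibility.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for tmpX (assumption: the real data is not in the post).
points, _ = make_blobs(n_samples=300, centers=5, random_state=0)

# Cluster the points; fit_predict returns one integer label per point.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)

# Draw a single scatter so the axes hold exactly one colored collection.
fig, ax = plt.subplots()
scatter = ax.scatter(points[:, 0], points[:, 1], c=labels, s=50, cmap="viridis")
ax.set_xlabel("Times Correct", size=12)
ax.set_ylabel("Times Incorrect", size=12)

# legend_elements() returns one handle and one label per unique value in `c`.
handles, legend_labels = scatter.legend_elements()
legend = ax.legend(handles, legend_labels, title="Cluster")
```

In a notebook with `%matplotlib inline` the figure should render at the end of the cell; in a script, `plt.show()` would be needed. Dropping the first, uncolored `plt.scatter` call (or drawing it on a separate figure) keeps it from adding a second artist to the same axes.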
