I was working through some data, following the examples at https://www.askpython.com/python/examples/plot-k-means-clusters-python

However, when I try to plot the data, I get a `name 'label' is not defined` error.

Below is the sample data:

d = {'pca1': [0.844536, 0.844536, -0.365379, 0.844536, -0.129814, -0.158512, -0.158512],
      'pca2': [0.222014, 0.222014, 0.468174, 0.222014, -0.280464, -0.481638, -0.481638],
      'label': [2, 2, 5, 2, 4, 3, 3]}


import pandas as pd

df = pd.DataFrame(d)

and the code to create the plot:

import numpy as np
import matplotlib.pyplot as plt

ul = np.unique(df['label'])

for i in ul:
    plt.scatter(df[label == i, 0], df[label == i, 1], label = i)
plt.legend()
plt.show()
Trenton McKinney
GSA
    You have a typo `df[label == i, 0]`, which should be `df[df.label == 2].iloc[:, 0]` or `df.loc[df.label == 2, 'pca1']`. The Boolean selection of data is incorrect. See the accepted answer of the duplicate – Trenton McKinney Nov 19 '21 at 21:16
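
A minimal sketch of the fix the comment points to, assuming the sample `df` from the question. `label` on its own is not a Python variable, so the column has to be reached through the DataFrame, and rows are then selected with a Boolean mask via `.loc`. The names `mask`, `fig`, and `ax` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample data from the question
d = {'pca1': [0.844536, 0.844536, -0.365379, 0.844536, -0.129814, -0.158512, -0.158512],
     'pca2': [0.222014, 0.222014, 0.468174, 0.222014, -0.280464, -0.481638, -0.481638],
     'label': [2, 2, 5, 2, 4, 3, 3]}
df = pd.DataFrame(d)

fig, ax = plt.subplots()
for i in np.unique(df['label']):
    # Boolean Series: True for the rows that belong to cluster i
    mask = df['label'] == i
    # Select those rows, then the two PCA columns by name
    ax.scatter(df.loc[mask, 'pca1'], df.loc[mask, 'pca2'], label=str(i))
ax.legend()
```

Calling `plt.show()` afterwards displays the figure. The original `df[label == i, 0]` is NumPy-array indexing as used in the tutorial; on a DataFrame the equivalent row/column selection goes through `.loc` (by column name) or `.iloc` (by position).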
