
I am getting

TypeError: unhashable type: 'slice'

when executing the code below to encode categorical data in Python. Can anyone please help?

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('50_Startups.csv')
y=dataset.iloc[:, 4]
X=dataset.iloc[:, 0:4]

# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:, 3] = labelencoder_X.fit_transform(X[:, 3])
piRSquared
kausik Chat
  • What's in the csv file? On which line did you get the TypeError? – am.rez Apr 08 '17 at 04:39
  • Please add the corresponding stack trace to your question. – Robert Valencia Apr 08 '17 at 05:28
  • Tell us about the `dataset`. I suspect its 'iloc' is expecting a string column label, not 2D array-like slicing. The error implies that a `slice` (e.g. 0:4) is being used as a dictionary key, or something like that. – hpaulj Apr 08 '17 at 06:31

7 Answers


X is a DataFrame, so it can't be indexed with NumPy-style syntax such as X[:, 3]. You must go through iloc or X.values instead. However, the way you constructed X made it a copy, so I'd use values:

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
# dataset = pd.read_csv('50_Startups.csv')

dataset = pd.DataFrame(np.random.rand(10, 10))
y=dataset.iloc[:, 4]
X=dataset.iloc[:, 0:4]

# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()

#  I changed this line
X.values[:, 3] = labelencoder_X.fit_transform(X.values[:, 3])
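
As a rough sketch of the `.values` route on data that actually contains a text column: the small frame below is a made-up stand-in for 50_Startups.csv (its column names and numbers are assumptions, not the real file). Converting to a NumPy array once and then assigning into that array avoids depending on whether `X.values` hands back a view or a fresh copy, which can vary with column dtypes and pandas version.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for 50_Startups.csv (column names and values are invented)
dataset = pd.DataFrame({
    "spend_a": [10.0, 20.0, 30.0],
    "spend_b": [1.0, 2.0, 3.0],
    "spend_c": [5.0, 6.0, 7.0],
    "state": ["New York", "California", "Florida"],
    "profit": [100.0, 200.0, 300.0],
})

# Convert to NumPy arrays up front so that [:, 3] indexing is valid
X = dataset.iloc[:, 0:4].values  # object dtype, because one column holds text
y = dataset.iloc[:, 4].values

# Replace the text column with integer codes (classes are sorted alphabetically)
labelencoder_X = LabelEncoder()
X[:, 3] = labelencoder_X.fit_transform(X[:, 3])

print(X[:, 3])
```

After this runs, `labelencoder_X.classes_` holds the original labels, so the codes can be mapped back with `inverse_transform` if needed.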
kevinji
piRSquared

Use .values either when creating the variable X or when encoding, as mentioned above.

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
# dataset = pd.read_csv('50_Startups.csv')

dataset = pd.DataFrame(np.random.rand(10, 10))
y=dataset.iloc[:, 4].values
X=dataset.iloc[:, 0:4].values
Renu

When creating the matrix X and the vector Y, use .values.

X = dataset.iloc[:, 0:4].values
Y = dataset.iloc[:, 4].values

This should solve your problem.

Stephen Kennedy
Gurbaksh Singh

If you use .values when creating the matrix X and the vector y, it will fix the problem.

y=dataset.iloc[:, 4].values

X=dataset.iloc[:, 0:4].values

Using .values returns a NumPy representation of the DataFrame: only the values are returned, and the axes labels are removed. See the link below for more information.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.values.html
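
To illustrate what `.values` gives back, here is a small sketch with a made-up frame (`df` and its columns are assumptions for illustration). The pandas documentation recommends `DataFrame.to_numpy()` as the preferred spelling, but `.values` behaves as shown here.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing numeric and text columns
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0], "c": ["x", "y"]})

arr = df.values  # plain NumPy array; row and column labels are gone

print(type(arr).__name__)
print(arr.shape)
print(arr.dtype)

# NumPy-style 2D indexing now works, unlike on the DataFrame itself
last_col = arr[:, 2]
print(last_col)
```

Because the frame mixes floats and strings, the resulting array falls back to a common `object` dtype.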

Navarasu
chandan p

I was getting the same error (TypeError: unhashable type: 'slice') with the code below:

included_cols = [2,4,10]
dataset = dataset[:,included_cols]  #Columns 2,4 and 10 are included.

I resolved it by adding iloc after dataset:

included_cols = [2,4,10]
dataset = dataset.iloc[:,included_cols]  #Columns 2,4 and 10 are included.
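
For a self-contained check of that fix, here is a minimal sketch on a made-up 3x12 frame (the data is invented; only `included_cols` and the `iloc` call come from the answer above).

```python
import pandas as pd

# Hypothetical data: 3 rows, 12 integer-labelled columns, cell value = row * 10 + column
dataset = pd.DataFrame([[row * 10 + col for col in range(12)] for row in range(3)])

included_cols = [2, 4, 10]

# iloc takes (rows, columns) by position, so a list of column positions is fine here
subset = dataset.iloc[:, included_cols]

print(subset)
```

The result is still a DataFrame, so later steps that expect NumPy-style indexing would still need `.values` or further `iloc` calls.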
Robert
Sunitha G

Try changing X[:, 3] to X.iloc[:, 3] in the label encoder call.
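
A minimal sketch of that suggestion, using a made-up two-column frame (`X` and its column names are assumptions). Reading the column with `iloc` is the point being illustrated; writing the codes back by column label is just one option chosen for this sketch, since it replaces the column wholesale regardless of its original dtype.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical feature frame with one text column at position 1
X = pd.DataFrame({"spend": [1.0, 2.0, 3.0], "state": ["NY", "CA", "NY"]})

labelencoder_X = LabelEncoder()

# Read the column by position with iloc instead of X[:, 1]
codes = labelencoder_X.fit_transform(X.iloc[:, 1])

# Write the integer codes back, replacing the text column
X[X.columns[1]] = codes

print(X)
```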

Anvesh

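Your x and y are not being built correctly, so start by defining them like this:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Raw string so the backslashes in the Windows path are not treated as escapes
dataframe = pd.read_csv(r".\datasets\Position_Salaries.csv")

x = dataframe.iloc[:, 1:2].values   # 2D feature matrix
y = dataframe.iloc[:, 2].values     # 1D target vector
x1 = dataframe.iloc[:, :-1].values  # every column except the last

The key point is to use .values.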

Ali ÜSTÜNEL