
I am trying to prepare data for a neural network. The data is loaded as shown below:

# load existing Iris dataset
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn import datasets
iris = datasets.load_iris()

# make a new dataframe
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['target'] = iris.target


After loading the data, I want to split it into training and test sets and then normalize it.

I want to normalize only the first two columns (sepal length (cm) and sepal width (cm)) with a min-max scaler, and then concatenate them with the last two columns (petal length (cm) and petal width (cm)). After normalization, the data is converted into a numpy.ndarray, and I can't merge the four columns back together.

np.random.seed(seed=1)

train_data = df.sample(frac=0.8, random_state=0)
test_data = df.drop(train_data.index)

target_fields = ['target']

X_train, Y_train = train_data.drop(target_fields, axis=1), train_data[target_fields]
X_test, Y_test = test_data.drop(target_fields, axis=1), test_data[target_fields]

# Min Max scaler
scaler = MinMaxScaler(feature_range=(0, 1))

X_train = scaler.fit_transform(X_train)
X_test = scaler.fit_transform(X_test)

Y_train = scaler.fit_transform(Y_train)
Y_test = scaler.fit_transform(Y_test)

How can I solve this problem?
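One possible approach, offered as a minimal sketch rather than a definitive fix: pass only the two sepal columns to the scaler and assign the result back into the same DataFrame columns, so the petal columns stay untouched and no concatenation is needed. The name `cols_to_scale` is introduced here for illustration only; the split mirrors the one in the question.

```python
import pandas as pd
from sklearn import datasets
from sklearn.preprocessing import MinMaxScaler

# Rebuild the DataFrame from the question
iris = datasets.load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['target'] = iris.target

# Same 80/20 split as in the question
train_data = df.sample(frac=0.8, random_state=0)
test_data = df.drop(train_data.index)

# Copy so that assigning scaled values does not touch train_data/test_data
X_train = train_data.drop(columns=['target']).copy()
X_test = test_data.drop(columns=['target']).copy()

# Hypothetical helper name: the columns that should be normalized
cols_to_scale = ['sepal length (cm)', 'sepal width (cm)']

scaler = MinMaxScaler(feature_range=(0, 1))

# Fit on the training columns only, then reuse the fitted scaler on the test set.
# Assigning the ndarray back keeps X_train and X_test as DataFrames.
X_train[cols_to_scale] = scaler.fit_transform(X_train[cols_to_scale])
X_test[cols_to_scale] = scaler.transform(X_test[cols_to_scale])

print(X_train.head())
```

Because the scaled values are written back into the existing columns, `X_train` and `X_test` remain DataFrames with all four feature columns in their original order.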

desertnaut
silent_hunter
  • Irrelevant to your issue, but you are using the scaling wrong: we use `fit_transform` only on the *training* data; on the test data we use `transform` (i.e. we do not *fit* again). Moreover, when we want to scale both our features `X` and dependent variables `y`, we use two separate scalers and not one for both - see my answer here: https://stackoverflow.com/questions/48973140/how-to-interpret-mse-in-keras-regressor/49009442#49009442 – desertnaut Mar 22 '22 at 13:56
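The point made in the comment above can be sketched as follows, using small made-up regression arrays (not the Iris data) purely for illustration. The names `x_scaler` and `y_scaler` are assumptions introduced here. Note that for a classification target such as the Iris class labels, scaling `y` is generally not needed at all.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical regression data, for illustration only
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([[10.0], [20.0], [30.0], [40.0]])
X_test = np.array([[2.5], [5.0]])
y_test = np.array([[25.0], [50.0]])

# Two separate scalers: one for the features, one for the target
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()

# Fit on the training data only
X_train_s = x_scaler.fit_transform(X_train)
y_train_s = y_scaler.fit_transform(y_train)

# Reuse the fitted scalers on the test data (transform, not fit_transform)
X_test_s = x_scaler.transform(X_test)
y_test_s = y_scaler.transform(y_test)

# The target scaler can map scaled values back to the original units
y_back = y_scaler.inverse_transform(y_test_s)

print(X_test_s.ravel(), y_back.ravel())
```

Test values that fall outside the training range map outside [0, 1]. That is expected, because the scaler only knows the minimum and maximum of the training data.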
