
How do I save a fitted StandardScaler() model in sklearn? I need to put a model into production and don't want to load the training data again and again just so StandardScaler can re-learn its parameters before transforming the new data I want to make predictions on.

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

#standardizing after splitting
X_train, X_test, y_train, y_test = train_test_split(data, target)
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)
– Abhinav Bajpai

2 Answers


You can use joblib's `dump` function to save the fitted scaler. Here's a complete example for reference.

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

data, target = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(data, target)

sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)

To save the fitted `sc` scaler, use the following:

from joblib import dump, load
dump(sc, 'std_scaler.bin', compress=True)

This creates the file std_scaler.bin containing the fitted scaler.

To read the scaler back later, use `load`:

sc = load('std_scaler.bin')

Note: `sklearn.externals.joblib` is deprecated and has been removed from recent scikit-learn versions. Install and import the standalone `joblib` package instead.
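To check the round trip end to end, here is a minimal sketch (the data, file name, and variable names are illustrative) that fits a scaler, saves it with the standalone joblib package, reloads it, and confirms the reloaded scaler produces the same transform as the original:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from joblib import dump, load

# Fit the scaler on some toy "training" data
X_train = np.array([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]])
sc = StandardScaler().fit(X_train)

# Persist to disk, then reload as if in a separate prediction script
dump(sc, 'std_scaler.bin', compress=True)
sc_loaded = load('std_scaler.bin')

# The reloaded scaler carries the learned statistics and
# reproduces the original transform exactly
X_new = np.array([[2.0, 20.0]])
assert np.allclose(sc_loaded.mean_, sc.mean_)
assert np.allclose(sc_loaded.transform(X_new), sc.transform(X_new))
```

In production only the second half is needed: load the scaler once at startup and call `transform` on incoming data, with no training data in sight.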

– sukhbinder

Or if you like to pickle:

import pickle
pickle.dump(sc, open('file/path/scaler.pkl','wb'))

sc = pickle.load(open('file/path/scaler.pkl','rb'))
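A variant of the same pickle approach using `with open()` blocks, so the file handles are closed deterministically instead of relying on garbage collection (the path and toy data here are illustrative):

```python
import pickle
import numpy as np
from sklearn.preprocessing import StandardScaler

# Fit on toy data; in practice this is your real training set
sc = StandardScaler().fit(np.array([[1.0], [3.0]]))

# Save: the file is closed as soon as the with-block exits
with open('scaler.pkl', 'wb') as f:
    pickle.dump(sc, f)

# Load: same pattern on the reading side
with open('scaler.pkl', 'rb') as f:
    sc = pickle.load(f)
```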
– Kevin Mc
    This should be the accepted answer. Although, I would prefer using `with open()..` instead of relying on the gc to close the file. – np8 Apr 04 '21 at 10:41