
How do I save a fitted StandardScaler() model in sklearn? I need to put a model into production and don't want to load the training data again and again just so StandardScaler can re-learn its parameters before transforming the new data I want to make predictions on.

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

#standardizing after splitting
X_train, X_test, y_train, y_test = train_test_split(data, target)
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)
– Abhinav Bajpai

2 Answers


You can use joblib's `dump` function to save the fitted scaler. Here's a complete example for reference.

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris

data, target = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(data, target)

sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)

To save the fitted `sc` scaler, use the following:

from joblib import dump, load
dump(sc, 'std_scaler.bin', compress=True)

This creates the file std_scaler.bin containing the fitted scaler.

To read the scaler back later, use `load`:

sc = load('std_scaler.bin')

Note: `sklearn.externals.joblib` is deprecated and has been removed from recent scikit-learn versions. Install and import the standalone `joblib` package instead.
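To check the round trip end to end, here is a minimal sketch (the data, file name, and variable names are illustrative) that fits a scaler, saves it with the standalone joblib package, reloads it, and confirms the reloaded scaler produces the same transform as the original:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from joblib import dump, load

# Fit the scaler on some toy "training" data
X_train = np.array([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]])
sc = StandardScaler().fit(X_train)

# Persist to disk, then reload as if in a separate prediction script
dump(sc, 'std_scaler.bin', compress=True)
sc_loaded = load('std_scaler.bin')

# The reloaded scaler carries the learned statistics and
# reproduces the original transform exactly
X_new = np.array([[2.0, 20.0]])
assert np.allclose(sc_loaded.mean_, sc.mean_)
assert np.allclose(sc_loaded.transform(X_new), sc.transform(X_new))
```

In production only the second half is needed: load the scaler once at startup and call `transform` on incoming data, with no training data in sight.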

– sukhbinder

Or if you like to pickle:

import pickle
pickle.dump(sc, open('file/path/scaler.pkl','wb'))

sc = pickle.load(open('file/path/scaler.pkl','rb'))
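A variant of the same pickle approach using `with open()` blocks, so the file handles are closed deterministically instead of relying on garbage collection (the path and toy data here are illustrative):

```python
import pickle
import numpy as np
from sklearn.preprocessing import StandardScaler

# Fit on toy data; in practice this is your real training set
sc = StandardScaler().fit(np.array([[1.0], [3.0]]))

# Save: the file is closed as soon as the with-block exits
with open('scaler.pkl', 'wb') as f:
    pickle.dump(sc, f)

# Load: same pattern on the reading side
with open('scaler.pkl', 'rb') as f:
    sc = pickle.load(f)
```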
– Kevin Mc
    This should be the accepted answer. Although, I would prefer using `with open()..` instead of relying on the gc to close the file. – np8 Apr 04 '21 at 10:41