
I created a dummy dataset and compared the performance of SKLearn LinearRegression and Keras. Why is Keras producing horrible results compared to Linear Regression?

Code:

# Create Dataset
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=5000, n_features=10, noise=0.1)

Build Linear Regression

from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

lr = LinearRegression()
lr.fit(X, y)
prediction_lr = lr.predict(X)

Build Keras Linear Regression

from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(1, activation='relu', input_dim=10))
model.compile(optimizer='rmsprop', loss='mse')
model.fit(X, y, epochs=100, verbose=0)
prediction_nn = model.predict(X)

print(f'LR MSE: {mean_squared_error(prediction_lr, y)}')
print(f'NN MSE: {mean_squared_error(prediction_nn, y)}')

Output:

LR MSE: 0.010068399696132291
NN MSE: 26936.27829985695

Why is there such a dramatic difference in MSE? How can we replicate linear regression using Keras?

Thanks

  • You have a relu activation function in your output layer. You probably want a linear activation function that allows for values less than zero. – Dave
  • I disagree with the closure and think this is a specific problem that could generate an answer getting into what a neural network does to replicate linear regression (and I think this because I am writing such an answer in my head). – Dave
  • sklearn uses regularization by default, though regularization can be disabled (unlike in older versions). Disable the regularization to do ordinary least squares. – Dave Feb 07 '22 at 17:27
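
For reference, a minimal sketch of the fix the first comment describes: the same single-unit network, but with a linear output instead of relu so predictions can take negative values. The optimizer, learning rate, epoch count, and batch size below are illustrative assumptions, not taken from the original post.

from keras import models, layers, optimizers

# One Dense unit with a linear activation computes y_hat = Xw + b,
# which is exactly the linear regression model. The relu in the
# original code clips all negative predictions to zero, which is
# where the huge MSE comes from.
model = models.Sequential()
model.add(layers.Dense(1, activation='linear', input_dim=10))

# Minimizing MSE on this model is the least-squares objective;
# Adam, the learning rate, and the epoch count are guesses that
# should suffice for a problem this small.
model.compile(optimizer=optimizers.Adam(learning_rate=0.01), loss='mse')
model.fit(X, y, epochs=200, batch_size=64, verbose=0)

prediction_nn = model.predict(X)
print(f'NN MSE: {mean_squared_error(y, prediction_nn)}')

With a linear output, the network has the same hypothesis class as LinearRegression, so with enough training it should approach the closed-form solution up to optimizer noise; the learned weights can be compared against lr.coef_ via model.layers[0].get_weights().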
