
I am trying to do linear regression with a single feature: predicting height from weight. Gradient descent took too many epochs to converge, so I applied a min-max scaler to the feature, after which it converged to the optimum pretty quickly.

However, predictions are now too high. What do I need to do to get correct predictions? Here's my code:

import numpy as np

def min_max_scaler(arr):
    # Rescale each column of arr to the [0, 1] range.
    x = arr.copy()
    minimum = np.min(x, axis=0)
    maximum = np.max(x, axis=0)
    x = (x - minimum) / (maximum - minimum)
    return x
class LinearRegression:
    def __init__(self, theta):
        self.theta = theta

    def predict(self, X):
        return X @ self.theta

    def compute_cost(self, X, y):
        # Mean squared error over the m training examples.
        yhat = self.predict(X)
        m = len(y)
        return (1 / m) * np.sum((yhat - y) ** 2)

    def train(self, X, y, alpha, epochs):
        m, n = X.shape
        cost_history = np.zeros(epochs)
        for i in range(epochs):
            # Gradient of the MSE cost with respect to each theta[j].
            nabla = np.zeros(n)
            for j in range(n):
                nabla[j] = (2 / m) * np.sum((self.predict(X) - y) @ X[:, j])
            self.theta -= alpha * nabla
            cost_history[i] = self.compute_cost(X, y)
        return cost_history

    Dear @Nabin, welcome to SO. In order to help you better, could you please provide details about predictors being "too high"? Ideally, it would help to provide a small dataset where the problem occurs, in order to illustrate what you mean. – Roland Aug 04 '20 at 08:11
  • The way the min_max_scaler function is defined, you're only retrieving the max/min of a particular array, but not saving it for the future. That's not how scaling should be done: you want to store the scaling values used for your training data so you can use the same values for your testing data. See: https://stats.stackexchange.com/questions/174823/how-to-apply-standardization-normalization-to-train-and-testset-if-prediction-i/174865#174865 – Sycorax Aug 04 '20 at 12:53
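Following up on the comment above, here is a minimal sketch of what "storing the scaling values" could look like: fit the scaler once on the training data, keep the min/max, and reuse them for any later data. The class and method names (`MinMaxScaler`, `fit`, `transform`) are illustrative, not from the question's code. – Sycorax Aug 04 '20 at 12:55

```python
import numpy as np

class MinMaxScaler:
    def fit(self, X):
        # Remember the training-set statistics for later transforms.
        self.minimum = np.min(X, axis=0)
        self.maximum = np.max(X, axis=0)
        return self

    def transform(self, X):
        # Always apply the *training* min/max, even to new/test data.
        return (X - self.minimum) / (self.maximum - self.minimum)

X_train = np.array([[150.0], [160.0], [170.0], [180.0]])
X_test = np.array([[165.0]])

scaler = MinMaxScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)  # same scale as the training data
```

With this pattern, a test point is mapped into the same coordinate system the model was trained in, so the learned theta stays meaningful at prediction time.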

0 Answers