I was following another example on how to fit a lasso regression in R. The image posted below shows the code. I need help finding the RMSE.

Edmond
  • 11
  • 1
  • Hi, welcome to the site! We can't see your image; could you try again? Regardless, the only hard part in computing RMSE is extracting the predictions. Once you've got those, RMSE is simply, in Python notation, $\texttt{np.sqrt(np.mean(np.square(pred-y)))}$. – John Madden Dec 30 '22 at 19:47
  • sqrt(mean((y-predictions)^2)) is the R syntax. If you need help extracting the predictions from the model, a software site like the original Stack Overflow could be a valuable resource. – Dave Dec 30 '22 at 19:53

1 Answer

Let’s break down the term.

ROOT: Take the square root.

MEAN: Add up a bunch of numbers, and divide by the number of values you added up.

SQUARED: Multiply a number by itself (square the number).

ERROR: The error is how far the prediction missed the true value, that is, the difference between the truth and the prediction. (You can subtract in either order, since the difference is squared.)

Consequently, root mean squared error, where $y_i$ is the $i^{th}$ observed value, $\hat y_i$ is the $i^{th}$ prediction, and $n$ is the sample size, is:

$$ RMSE=\sqrt{ \dfrac{ \sum_{i=1}^n\left( y_i - \hat y_i \right)^2 }{ n } } $$

(If this reminds you of standard deviation, there is a reason the formulas look so similar. RMSE relates to the estimated standard deviation of a regression error term.)

Once you extract predictions from the model, just apply this formula.
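As a minimal sketch of applying the formula, here is the calculation in plain Python, following the ROOT / MEAN / SQUARED / ERROR breakdown step by step. The function name `rmse` and the numbers are made up for illustration; in practice `y_pred` would hold the predictions extracted from your fitted model. The R one-liner from the comments, `sqrt(mean((y-predictions)^2))`, computes the same quantity.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed values and predictions.

    Hypothetical helper for illustration; divides by the sample size n.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one observation")

    # ERROR, then SQUARED: difference between truth and prediction, squared
    squared_errors = [(y - p) ** 2 for y, p in zip(y_true, y_pred)]
    # MEAN: add the squared errors up and divide by how many there are
    mean_squared_error = sum(squared_errors) / len(squared_errors)
    # ROOT: take the square root
    return math.sqrt(mean_squared_error)


# Made-up observed values and predictions, purely for illustration
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

result = rmse(y_true, y_pred)
print(result)
```

Here the errors are 1, 0, -1 and -2, so the squared errors sum to 6, the mean is 1.5, and the RMSE is the square root of 1.5. Swapping the two arguments gives the same value, since the differences are squared.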

Some sources will divide by $n-1$, $n-2$ or $n-p$. These all relate to other ways of estimating the regression error standard deviation and have their advantages. Without context to clarify the exact formula, though, I would default to dividing by the sample size $n$ (particularly for a LASSO model).
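To see how much the choice of denominator matters, the sketch below computes the same sum of squared errors and divides it by $n$, $n-1$ and $n-p$. The helper name `root_mean_square_with_denominator`, the data, and the value `p = 2` are all assumptions made for illustration; in particular, $p$ is taken here to mean the number of estimated parameters, which the text above does not spell out.

```python
import math


def root_mean_square_with_denominator(y_true, y_pred, denominator):
    """Square root of (sum of squared errors / denominator).

    Hypothetical helper: the caller chooses the denominator (n, n-1, n-p, ...).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    sum_squared_errors = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return math.sqrt(sum_squared_errors / denominator)


# Made-up data, purely for illustration
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.0, 5.0, 8.0, 11.0, 10.0]

n = len(y_true)
p = 2  # assumed number of estimated parameters (so n - p equals n - 2 here)

by_n = root_mean_square_with_denominator(y_true, y_pred, n)
by_n_minus_1 = root_mean_square_with_denominator(y_true, y_pred, n - 1)
by_n_minus_p = root_mean_square_with_denominator(y_true, y_pred, n - p)

print(by_n, by_n_minus_1, by_n_minus_p)
```

A smaller denominator always gives a larger value, and the gap shrinks as the sample size grows, so for large $n$ the versions are nearly indistinguishable.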

Dave
  • 62,186