I'm trying to get a deeper understanding of how OLS works. One thing that I thought I understood is the difference between standard errors and residuals.
Here are the two definitions as I understand them:
- Standard errors: The average distance that the observed values fall from the regression line.
- Residuals: The difference between the actual value and the value predicted by the model ($y_i - \hat y_i$) for any given point.
I always assumed that number 2 (the residuals) was unobservable, although this post actually claims it's the other way around: https://stats.stackexchange.com/a/232588/334202.
But if I run a simple regression in R like this, I get both! So how should I think about this?
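    library(tidyverse)
    library(broom)

    # Keep only the columns used below
    mtsmall <- mtcars |>
      rownames_to_column(var = "carnames") |>
      as_tibble() |>
      select(mpg, hp, wt)

    # Simple OLS regression of mpg on hp
    model1 <- lm(mpg ~ hp, mtsmall)

    # Attach fitted values, residuals and diagnostics to the data
    mtsmall_predicted <- augment_columns(model1, mtsmall) |>
      rename(.mpg_hat = .fitted)

    mtsmall_predicted |> head(5)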
Output:
        mpg    hp    wt .mpg_…¹ .se.fit .resid   .hat .sigma .cooksd
      <dbl> <dbl> <dbl>   <dbl>   <dbl>  <dbl>  <dbl>  <dbl>   <dbl>
    1  21     110  2.62    22.6   0.777 -1.59  0.0405   3.92 3.74e-3
    2  21     110  2.88    22.6   0.777 -1.59  0.0405   3.92 3.74e-3
    3  22.8    93  2.32    23.8   0.873 -0.954 0.0510   3.92 1.73e-3
    4  21.4   110  3.22    22.6   0.777 -1.19  0.0405   3.92 2.10e-3
    5  18.7   175  3.44    18.2   0.741  0.541 0.0368   3.93 3.89e-4
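
To make the comparison concrete, here is a minimal base-R sketch of how I believe the `.resid` and `.se.fit` columns can be recomputed by hand. It assumes that `.se.fit` is the standard error of each fitted value, i.e. $\hat\sigma \sqrt{h_{ii}}$ with $h_{ii}$ the leverage (`.hat`) and $\hat\sigma$ the residual standard error of the full model. The `*_manual` names are my own and are only for illustration.

```r
# Fit the same model with base R only (mtcars ships with R).
model1 <- lm(mpg ~ hp, data = mtcars)

# Residuals: observed value minus fitted value, one per observation.
resid_manual <- mtcars$mpg - fitted(model1)

# Residual standard error: a single number for the whole model,
# sqrt(sum of squared residuals / residual degrees of freedom).
sigma_manual <- sqrt(sum(resid_manual^2) / df.residual(model1))

# Standard error of each fitted value: sigma * sqrt(leverage).
# (Assumption: this is the quantity reported in the .se.fit column.)
se_fit_manual <- sigma_manual * sqrt(hatvalues(model1))

# Compare with what R reports; each line should print TRUE if the
# assumptions above hold.
all.equal(unname(resid_manual), unname(residuals(model1)))
all.equal(sigma_manual, sigma(model1))
all.equal(unname(se_fit_manual),
          unname(predict(model1, se.fit = TRUE)$se.fit))
```

If that is right, the residual is computed directly from the data and the fit, whereas `.se.fit` is an estimate of how uncertain each fitted value is, so the two columns measure different things.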