
I'm trying to calculate the standard error of a predictor variable in a simple linear regression, but I'm not sure how to compute it by hand. Everything I've found so far explains how to compute either the residual standard error or the estimate of the predictor's coefficient.

For example,

Call:
lm(formula = data$price ~ data$sqft_living)

Residuals:
Min      1Q  Median      3Q     Max 
-418475 -136416  -34874  108445 1259547 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)      38289.31   61277.30   0.625    0.534    
data$sqft_living   230.22      27.15   8.481 2.36e-13 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 232400 on 98 degrees of freedom
Multiple R-squared:  0.4233,    Adjusted R-squared:  0.4174 
F-statistic: 71.93 on 1 and 98 DF,  p-value: 2.362e-13

I'm trying to manually compute the 27.15 under the standard error of sqft_living, but I can't find a formula anywhere.

Thanks in advance!

mdlee6
    What you want is the standard error of the coefficient estimate and not the predictor itself. The formula for the standard error of $\hat{\beta}_1$ can be found on the Wikipedia page for simple linear regression: https://en.wikipedia.org/wiki/Simple_linear_regression in the "Normality assumption" section – dsaxton Sep 22 '16 at 19:26

1 Answer


The standard error in that column (as the comment pointed out) belongs to the estimated coefficient $\hat{\beta}$, not to your predictor variable.

In simple linear regression we fit the model $y_i = \alpha + \beta x_i + \epsilon_i$.

The standard error of $\hat{\beta}$ is straightforward to derive. First, in simple linear regression we estimate $\beta$ using the formula

$\hat{\beta} = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2}$

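If we assume that the error terms $\epsilon_i$ are independent and normally distributed with mean zero and constant variance, then our estimate $\hat{\beta}$, being a linear combination of the $y_i$, will also be normally distributed (treating the $x_i$ as fixed).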

An estimate for the standard error of $\hat{\beta}$ is given by the formula

$\widehat{SE}(\hat{\beta}) = \sqrt{\frac{\frac{1}{n-2}\sum_{i=1}^n (\hat{y}_i-y_i)^2}{\sum_{i=1}^n (x_i-\bar{x})^2}}$

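This follows from the fact that $\hat{\beta}$ has mean $\beta$ and variance $\frac{Var(\epsilon_i)}{\sum_{i=1}^n (x_i-\bar{x})^2}$. To see where the variance comes from, note that $\sum_{i=1}^n (x_i - \bar{x})\bar{y} = 0$, so $\hat{\beta}$ is a linear combination of the $y_i$:

$\hat{\beta} = \sum_{i=1}^n w_i y_i, \quad w_i = \frac{x_i - \bar{x}}{\sum_{j=1}^n (x_j - \bar{x})^2}$

Treating the $x_i$ as fixed and assuming the errors are uncorrelated with common variance $Var(\epsilon_i)$, we get

$Var(\hat{\beta}) = \sum_{i=1}^n w_i^2 \, Var(\epsilon_i) = Var(\epsilon_i) \frac{\sum_{i=1}^n (x_i - \bar{x})^2}{\left(\sum_{j=1}^n (x_j - \bar{x})^2\right)^2} = \frac{Var(\epsilon_i)}{\sum_{i=1}^n (x_i-\bar{x})^2}$

The unbiased estimate of $Var(\epsilon_i)$ is $\frac{1}{n-2}\sum_{i=1}^n (\hat{y}_i-y_i)^2$, which is the numerator under the square root in the standard error formula. Substituting it for $Var(\epsilon_i)$ and taking the square root gives that formula.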
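As a rough check, the standard error formula can be reproduced by hand in R and compared with what `summary(lm(...))` reports. This is only a sketch: the asker's data frame is not available, so the values below are simulated and purely hypothetical, and the variable names `sqft_living` and `price` are borrowed from the question for readability.

```r
# Simulated stand-in for the asker's data (hypothetical values, not the real data).
set.seed(1)
n <- 100
sqft_living <- runif(n, min = 500, max = 4000)
price <- 40000 + 230 * sqft_living + rnorm(n, sd = 230000)

fit <- lm(price ~ sqft_living)

# Closed-form slope and intercept estimates.
sxx <- sum((sqft_living - mean(sqft_living))^2)
beta_hat <- sum((sqft_living - mean(sqft_living)) * (price - mean(price))) / sxx
alpha_hat <- mean(price) - beta_hat * mean(sqft_living)

# Unbiased estimate of the error variance, using n - 2 degrees of freedom.
fitted_manual <- alpha_hat + beta_hat * sqft_living
sigma2_hat <- sum((fitted_manual - price)^2) / (n - 2)

# Standard error of the slope: sqrt(sigma2_hat / sum((x - mean(x))^2)).
se_beta <- sqrt(sigma2_hat / sxx)

# The same quantity as reported in the "Std. Error" column of summary(fit).
se_lm <- summary(fit)$coefficients["sqft_living", "Std. Error"]
print(c(manual = se_beta, lm = se_lm))
print(all.equal(se_beta, se_lm))
```

In the output from the question, the "Residual standard error: 232400 on 98 degrees of freedom" line is the square root of the variance estimate (`sqrt(sigma2_hat)` in the sketch, with $n - 2 = 98$), so 27.15 should equal 232400 divided by $\sqrt{\sum (x_i-\bar{x})^2}$ computed from the `sqft_living` column.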

J. Auon
Patty
  • 1) How do you get the 2nd formula? 2) Why does $\hat{\beta}$ have variance $\frac{Var(\epsilon_i)}{\sum_1^n (x_i-\bar{x})^2}$? I feel that there are a number of intermediate steps missing at the end. – Trajan Nov 01 '20 at 19:54
  • Your answer does not work for the intercept, B0, if for no other reason than that the denominator of the expression is zero. The formula you give is all over the Internet, but no one gives a similar expression for the intercept. – CElliott Aug 21 '21 at 21:29