The Wikipedia page Simple linear regression gives formulas to calculate $\alpha$ and $\beta$. Could anyone tell me how to derive the corresponding formulas in the weighted case?

Wei Shi
  • A derivation is in the Wikipedia article: http://en.wikipedia.org/wiki/Least_squares#Weighted_least_squares – whuber Jul 05 '11 at 16:29

2 Answers


Think of ordinary least squares (OLS) as a "black box" to minimize

$$\sum_{i=1}^n (y_i - (\alpha 1 + \beta x_i))^2$$

for a data table whose $i^\text{th}$ row is the tuple $(1, x_i, y_i)$.

When there are weights (which are necessarily positive), we can write them as $w_i^2$. By definition, weighted least squares minimizes

$$\sum_{i=1}^n w_i^2(y_i - (\alpha 1 + \beta x_i))^2$$

$$=\sum_{i=1}^n (w_i y_i - (\alpha w_i + \beta w_i x_i))^2 .$$

But that's exactly what the OLS black box is minimizing when given the data table consisting of the "weighted" tuples $(w_i, w_i x_i, w_i y_i)$. So, applying the OLS formulas to these weighted tuples gives the formulas you seek.
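To see this numerically, here is a minimal NumPy sketch (my own illustration, not part of the original answer): it runs plain OLS on the weighted tuples and checks the result against `np.polyfit`, whose `w` argument weights the unsquared residuals.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 50)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, 50)
w2 = rng.uniform(0.5, 2.0, 50)   # the positive weights, written w_i^2 above
w = np.sqrt(w2)                  # the w_i above

# OLS on the weighted tuples (w_i, w_i x_i, w_i y_i): regress w_i y_i on
# the columns w_i and w_i x_i, with no separate intercept column.
A = np.column_stack([w, w * x])
alpha, beta = np.linalg.lstsq(A, w * y, rcond=None)[0]

# np.polyfit minimizes sum((w[i] * residual_i)**2), so passing the same
# w_i reproduces the weighted fit directly (coefficients come back
# highest degree first).
beta_ref, alpha_ref = np.polyfit(x, y, 1, w=w)
print(np.allclose([alpha, beta], [alpha_ref, beta_ref]))   # True
```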

whuber

The answer from whuber gives the intuition behind the maths, which is nice to have, but I still could not figure out the formulas (i.e., where to put the weights). After some searching on the web, I found these slides, which give the following:

You want to minimize the following error (these $w_i$ play the role of the $w_i^2$ in whuber's answer): $$\sum_{i=1}^n w_i(y_i - (\alpha + \beta x_i))^2$$
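To sketch where the formulas below come from (my own working, not taken from the slides): set the partial derivatives of this error with respect to $\alpha$ and $\beta$ to zero, giving the weighted normal equations

$$-2\sum_{i=1}^n w_i(y_i - \alpha - \beta x_i) = 0, \qquad -2\sum_{i=1}^n w_i x_i(y_i - \alpha - \beta x_i) = 0 .$$

Dividing the first equation by $\sum_{i=1}^n w_i$ gives $\overline y_w - \alpha - \beta\,\overline x_w = 0$ (with the weighted means defined below); substituting the resulting $\alpha$ into the second equation and rearranging gives $\hat\beta$.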

Then, the optimal pair $(\hat\alpha,\hat\beta)$ is: $$\hat\alpha = \overline y_w - \hat\beta \overline x_w$$ $$\hat\beta = \frac{\sum_{i=1}^n w_i(x_i-\overline x_w)(y_i-\overline y_w)}{\sum_{i=1}^n w_i(x_i-\overline x_w)^2}$$

Where $\overline x_w$ and $\overline y_w$ are the weighted means:

$$\overline x_w = \frac{\sum_{i=1}^n w_ix_i}{\sum_{i=1}^n w_i}$$ $$\overline y_w = \frac{\sum_{i=1}^n w_iy_i}{\sum_{i=1}^n w_i}$$
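For reference, a direct NumPy translation of these formulas (a sketch of my own; the function name and test data are made up):

```python
import numpy as np

def wls_fit(x, y, w):
    """Weighted simple linear regression: minimizes
    sum_i w_i * (y_i - (alpha + beta * x_i))**2."""
    x, y, w = np.asarray(x), np.asarray(y), np.asarray(w)
    xw = np.sum(w * x) / np.sum(w)   # weighted mean of x
    yw = np.sum(w * y) / np.sum(w)   # weighted mean of y
    beta = np.sum(w * (x - xw) * (y - yw)) / np.sum(w * (x - xw) ** 2)
    alpha = yw - beta * xw
    return alpha, beta

# Example: estimates should land close to the true (2.0, 3.0).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 200)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, 200)
w = rng.uniform(0.5, 2.0, 200)
print(wls_fit(x, y, w))
```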