I'm fairly new to linear models, so I'd like an explanation for the following phenomenon.
I fitted two linear models in R. The first one outputs this:
lm(formula = Gries_PM10_value ~ Gries_PM10_value_lag1 + Gries_LUTE_value +
    Kalkleiten_WIGE_value)
Residuals:
     Min       1Q   Median       3Q      Max
-23.1273  -4.8379  -0.2452   5.1194  24.6230
Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
(Intercept)           15.83196    2.05175   7.716 1.73e-12 ***
Gries_PM10_value_lag1  0.47306    0.06918   6.838 2.04e-10 ***
Gries_LUTE_value      -0.24091    0.16299  -1.478  0.14155
Kalkleiten_WIGE_value -1.65277    0.53690  -3.078  0.00249 **
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.866 on 146 degrees of freedom
Multiple R-squared: 0.2966, Adjusted R-squared: 0.2821
F-statistic: 20.52 on 3 and 146 DF, p-value: 3.773e-11
The variables describe two places, Gries and Kalkleiten, which are 7 km apart. PM10 is a measure of fine particles, LUTE is the air temperature, WIGE is the wind speed, and PM10_lag1 is the PM10 value from the previous day.
So this model behaved as I'd expect: the previous-day PM10 value is highly significant, the wind speed is also significant, but the air temperature is not.
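For reference, here is a minimal sketch of how a model like this could be set up, including the one-day lag. The data frame `d` and all of its values are simulated stand-ins (an assumption, not the real measurements). Only the column names are taken from the formula above.

```r
# Hypothetical daily data: column names follow the model formula,
# but every value here is simulated, not measured.
set.seed(1)
n <- 151
d <- data.frame(
  Gries_PM10_value      = pmax(rnorm(n, mean = 30, sd = 9), 0),
  Gries_LUTE_value      = rnorm(n, mean = 5, sd = 4),
  Kalkleiten_WIGE_value = rexp(n, rate = 0.5)
)

# Previous-day PM10: shift the series down by one row.
# The first day has no predecessor, so its lag is NA.
d$Gries_PM10_value_lag1 <- c(NA, head(d$Gries_PM10_value, -1))

# With the default na.action (na.omit), lm() drops the row containing NA,
# so 150 of the 151 rows enter the fit.
fit1 <- lm(Gries_PM10_value ~ Gries_PM10_value_lag1 + Gries_LUTE_value +
             Kalkleiten_WIGE_value, data = d)
summary(fit1)
```

With 151 simulated days, one row is lost to the lag, which leaves 150 observations and 146 residual degrees of freedom, the same count as in the output above.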
But then I added the air temperature for Kalkleiten and got:
lm(formula = Gries_PM10_value ~ Gries_PM10_value_lag1 + Gries_LUTE_value +
    Kalkleiten_WIGE_value + Kalkleiten_LUTE_value)
Residuals:
     Min       1Q   Median       3Q      Max
-19.0517  -4.2531  -0.0896   4.4151  19.3326
Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
(Intercept)           22.85199    1.94546  11.746  < 2e-16 ***
Gries_PM10_value_lag1  0.28269    0.06309   4.481 1.50e-05 ***
Gries_LUTE_value      -2.68195    0.34075  -7.871 7.49e-13 ***
Kalkleiten_WIGE_value -1.95956    0.45343  -4.322 2.86e-05 ***
Kalkleiten_LUTE_value  2.52227    0.32231   7.826 9.65e-13 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.618 on 145 degrees of freedom
Multiple R-squared: 0.5054, Adjusted R-squared: 0.4918
F-statistic: 37.05 on 4 and 145 DF, p-value: < 2.2e-16
Now all the coefficients are strongly significant. Overall, this model seems to fit a lot better (adjusted R-squared rises from 0.28 to 0.49), even though I wouldn't expect a second air temperature to make such a big difference. What's the explanation behind this?
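The two temperature coefficients in the second model are close in size and opposite in sign (-2.68 and 2.52). One thing that might be worth checking is how strongly the two temperature series are correlated, and whether the model is effectively using their difference. The sketch below runs those checks on simulated data. The data frame `d`, its values, and the assumed dependence of PM10 on the temperature difference are all hypothetical, and the lag term is left out for brevity. It illustrates the checks and is not a claim about the real data.

```r
# Simulated stand-in data (hypothetical values; only the column names
# come from the post).
set.seed(42)
n <- 150
gries_t <- rnorm(n, mean = 5, sd = 4)
d <- data.frame(
  Gries_LUTE_value      = gries_t,
  # Assumption: the two stations' temperatures track each other closely.
  Kalkleiten_LUTE_value = gries_t + rnorm(n, mean = -1, sd = 1.5),
  Kalkleiten_WIGE_value = rexp(n, rate = 0.5)
)
# Assumption: PM10 responds to the temperature difference between stations.
d$Gries_PM10_value <- 25 +
  2.5 * (d$Kalkleiten_LUTE_value - d$Gries_LUTE_value) -
  1.5 * d$Kalkleiten_WIGE_value + rnorm(n, sd = 6)

# Check 1: how strongly are the two temperature predictors correlated?
r_temp <- cor(d$Gries_LUTE_value, d$Kalkleiten_LUTE_value)

# Check 2: partial F-test between the nested models.
small <- lm(Gries_PM10_value ~ Gries_LUTE_value + Kalkleiten_WIGE_value,
            data = d)
full  <- update(small, . ~ . + Kalkleiten_LUTE_value)
cmp   <- anova(small, full)

# Check 3: refit with the temperature difference as a single predictor.
diff_fit <- lm(Gries_PM10_value ~
                 I(Kalkleiten_LUTE_value - Gries_LUTE_value) +
                 Kalkleiten_WIGE_value, data = d)

print(r_temp)
print(cmp)
print(summary(full)$r.squared)
print(summary(diff_fit)$r.squared)
```

If the difference-only fit on the real data reaches nearly the same R-squared as the model with both temperatures, that would suggest the temperature contrast between the two sites, rather than either temperature alone, carries the information.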