
So say I am testing the predictive value of a predictor for a "DV" within a survey data set, between two time points. Say also that I have measurements of both variables at all time points, and that I can see that the DV and the predictor correlate at baseline. If I use linear regression or Pearson correlation to evaluate the association between the predictor at baseline and the DV at a later time point, do I need to correct for the correlation at baseline? I figure that if I do not correct, somehow, for this initial correlation, I am not only testing prediction of the DV but also the test-retest reliability of the predictor. Hope I am making sense.

1 Answer


There are two general ways to do this. Method 1 is to use the baseline measurement as a covariate in a linear model, as in

$\small \text{measurement}_2 = B_1 \cdot \text{measurement}_1 + B_0$
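As a minimal sketch of method 1 (assuming hypothetical simulated data, since the question gives none), the baseline-as-covariate model for a single predictor can be fit with the closed-form OLS formulas, without any statistics library:

```python
import random

random.seed(0)

# Hypothetical data: the same instrument measured at two time points,
# with a true slope of 0.6 linking baseline to follow-up.
n = 200
m1 = [random.gauss(50, 10) for _ in range(n)]
m2 = [0.6 * x + 20 + random.gauss(0, 5) for x in m1]

# Method 1: regress measurement_2 on measurement_1.
# Closed-form OLS for one predictor: B1 = cov(m1, m2) / var(m1).
mean1 = sum(m1) / n
mean2 = sum(m2) / n
b1 = sum((x - mean1) * (y - mean2) for x, y in zip(m1, m2)) / sum(
    (x - mean1) ** 2 for x in m1
)
b0 = mean2 - b1 * mean1

print(round(b1, 2), round(b0, 2))  # B1 should land near the true slope 0.6
```

In practice you would use a regression routine (e.g. `statsmodels.OLS`), and the model would also include the substantive predictor of interest alongside the baseline covariate; the sketch isolates the baseline-adjustment part.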

Method 2 is to use the difference score as the dependent variable in an intercept-only linear model, as in

$\small \text{measurement}_2 - \text{measurement}_1 = B_0$
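Method 2 is even simpler to sketch: the OLS estimate of the intercept in an intercept-only model is just the sample mean of the dependent variable, so fitting the difference-score model reduces to averaging the change scores (again on hypothetical simulated data):

```python
import random

random.seed(1)

# Hypothetical data: scores drift upward by 3 points on average
# between the two time points.
n = 200
m1 = [random.gauss(50, 10) for _ in range(n)]
m2 = [x + 3 + random.gauss(0, 5) for x in m1]

# Method 2: intercept-only model for the difference score.
# The OLS fit of an intercept-only model is the mean of the outcome.
diffs = [y - x for x, y in zip(m1, m2)]
b0 = sum(diffs) / n

print(round(b0, 2))  # estimate of the average change, near the true value 3
```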

Method 1 is generally more powerful than method 2. You can see this by noting that method 2 is just method 1 with the coefficient of $\small \text{measurement}_1$ fixed at 1: adding $\small \text{measurement}_1$ to both sides of method 2's equation gives

$\small \text{measurement}_2 - \text{measurement}_1 \; (+\ \text{measurement}_1) = (\text{measurement}_1) + B_0$

Effectively, method 2 assumes that the coefficient ($\small B_1$) of $\small \text{measurement}_1$ is equal to 1. Because, by definition, linear regression finds the coefficient estimates that minimize squared error, method 1 (which estimates $\small B_1$ freely) must of necessity fit at least as well as method 2 (which constrains it). However, there are some instances where you might want to use method 2, particularly when you have a second variable (say a dichotomous variable, such as a randomized intervention) on which there are pretest differences on your variable of interest. In this special case, method 1 will yield biased results for the test of the effect of the intervention, whereas method 2 will be unbiased.
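The power claim can be checked directly: fit both models to the same hypothetical simulated data (with a true $\small B_1$ deliberately far from 1) and compare residual sums of squares. Since OLS minimizes squared error over all slopes, the freely estimated model can never have a larger error than the constrained one:

```python
import random

random.seed(2)

# Hypothetical data with true slope 0.5, so the B1 = 1 constraint is wrong.
n = 300
m1 = [random.gauss(0, 1) for _ in range(n)]
m2 = [0.5 * x + random.gauss(0, 1) for x in m1]

# Method 1: unconstrained OLS slope and intercept.
mean1 = sum(m1) / n
mean2 = sum(m2) / n
b1 = sum((x - mean1) * (y - mean2) for x, y in zip(m1, m2)) / sum(
    (x - mean1) ** 2 for x in m1
)
b0 = mean2 - b1 * mean1
sse1 = sum((y - (b1 * x + b0)) ** 2 for x, y in zip(m1, m2))

# Method 2: forces B1 = 1; the fitted value is m1 plus the mean change.
d = sum(y - x for x, y in zip(m1, m2)) / n
sse2 = sum((y - (x + d)) ** 2 for x, y in zip(m1, m2))

print(sse1 < sse2)  # the unconstrained fit has smaller residual error
```

When the true coefficient really is close to 1 (and units were randomized), the two methods give similar answers; the bias issue in the non-randomized case is exactly the scenario Van Breukelen discusses.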

See Van Breukelen (2006) for more details.
