Say I am testing the predictive value of a predictor for a dependent variable (DV) in a survey data set with two time points. I have measurements of both variables at both time points, and I can see that the DV and the predictor correlate at baseline. If I use linear regression or a Pearson correlation to evaluate the association between the predictor at baseline and the DV at the later time point, do I need to correct for the correlation at baseline? My worry is that if I do not correct, somehow, for this initial correlation, I am not only testing prediction of the DV but also the test-retest reliability of the predictor. I hope I am making sense.
-
I've started some discussion in the chat room about making appropriate tags for such questions, and I link a handful of related questions and references there that you might be interested in. In particular, I would recommend viewing the answers to Best practice when analysing pre-post treatment-control designs. – Andy W Sep 03 '12 at 18:17
-
Well, there's no real control group in my example, but maybe the information in your link still applies if I understand it well enough? – Missing Bob Sep 03 '12 at 18:45
1 Answer
There are two general ways to do this. Method 1 is to use the baseline measurement as a covariate in a linear model, as in
$\small measurement_2 = B_1 * measurement_1 + B_0$
Method 2 is to use the difference score as the dependent variable in an intercept-only linear model, as in
$\small measurement_2 - measurement_1 = B_0$
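For concreteness, here is a minimal sketch of both methods in Python with statsmodels; the data frame, its columns `m1` (baseline) and `m2` (follow-up), and the simulated values are hypothetical stand-ins, not taken from the question.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical stand-in data: follow-up partially tracks baseline.
rng = np.random.default_rng(0)
m1 = rng.normal(50, 10, size=200)                # baseline measurement
m2 = 10 + 0.6 * m1 + rng.normal(0, 8, size=200)  # follow-up measurement
df = pd.DataFrame({"m1": m1, "m2": m2})

# Method 1: baseline measurement as a covariate in a linear model.
method1 = smf.ols("m2 ~ m1", data=df).fit()
print(method1.params)  # Intercept (B_0) and slope of m1 (B_1)

# Method 2: intercept-only model for the change score, which implicitly
# fixes the coefficient of m1 at 1.
df["diff"] = df["m2"] - df["m1"]
method2 = smf.ols("diff ~ 1", data=df).fit()
print(method2.params)  # Intercept (B_0) only
```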
Method 1 is generally more powerful than method 2. You can see this by noting that method 2 can be derived from method 1 by subtracting $\small measurement_1$ from both sides of the equation, as in:
$\small measurement_2 - measurement_1 = (B_1 - 1) * measurement_1 + B_0$
Effectively, method 2 assumes that the coefficient ($\small B_1$) of $\small measurement_1$ is equal to 1, so that the $\small measurement_1$ term on the right-hand side drops out. Because, by definition, linear regression finds the coefficients that minimize error, method 1 must of necessity fit at least as well as, and hence be more powerful than, method 2. However, there are some instances where you might want to use method 2: in particular, when you have a second variable (say a dichotomous grouping variable, such as a non-randomized treatment assignment) whose groups show pretest differences on your variable of interest. In this special case, method 1 will yield biased results for the test of the group effect, whereas method 2 will be unbiased.
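As a hedged illustration of that special case (the setup below is hypothetical, not from the answer or from Van Breukelen), the following simulation sketch creates two non-randomized groups that differ in their true baseline level, with measurement error and no true group effect on change. Method 1 (with the group added as a predictor) tends to report a spurious group effect, whereas method 2 (with the group predicting the change score) does not:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500  # per group

# Non-randomized groups whose true levels differ at baseline;
# there is no true group effect on change from baseline.
group = np.repeat([0, 1], n)
true_level = np.where(group == 1, 60.0, 50.0) + rng.normal(0, 5, size=2 * n)
m1 = true_level + rng.normal(0, 10, size=2 * n)  # noisy baseline
m2 = true_level + rng.normal(0, 10, size=2 * n)  # noisy follow-up, unchanged
df = pd.DataFrame({"group": group, "m1": m1, "m2": m2, "diff": m2 - m1})

# Method 1: measurement error shrinks the m1 coefficient below 1, so the
# baseline group difference leaks into the group coefficient (spurious effect).
print(smf.ols("m2 ~ m1 + group", data=df).fit().params["group"])

# Method 2: the change score is pure noise here, so the group coefficient
# is zero in expectation (unbiased).
print(smf.ols("diff ~ group", data=df).fit().params["group"])
```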
See Van Breukelen (2006), "ANCOVA versus change from baseline: More power in randomized studies, more bias in nonrandomized studies," Journal of Clinical Epidemiology, 59(9), 920–925, for more details.