Many psychology studies draw conclusions from experiments that follow a design along the lines of:
- take two groups and subject each of them to a different manipulation (e.g. different task instructions)
- measure both groups on one and the same dependent variable (DV)
- if the groups differ on that DV, claim that the manipulation is what explains the difference.
For instance, take two studies done by D. Kahneman et al. (I don't have the exact references, sorry):
Study 1: both groups are asked to read statements and to say whether they agree or disagree with each one; group A is asked to nod their heads (yes) while reading, while group B is asked to shake their heads (no). Group A is found to agree with the statements more often than group B.
Study 2: both groups are asked to rate how much they like topics T1 and T2, but the two ratings are requested in different orders (group A: T1 then T2; group B: T2 then T1). A correlation between the ratings of T1 and T2 is found for group B but not for group A.
Is this type of inference well-founded? To me it seems not, unless there is additional evidence that the two groups do not differ in the absence of (or prior to) the manipulation, i.e. some demonstration of baseline equivalence. Such evidence is, however, almost always missing from studies of this kind, with the exception of the few that report "pre" and "post" measurements (which satisfy the condition above).
In the absence of such evidence, how is one meant to correct for the trivial confound that the two groups simply have different baseline preferences for the thing they are required to rate, a confound that could well explain the results single-handedly? Surely a simple independent-samples t-test is not enough to account for such baseline differences. Or is it just a matter of replacing it with a more sophisticated statistical test?
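To make the worry concrete, here is a minimal Python sketch of the scenario I have in mind. It is only an illustration under stated assumptions, not a reconstruction of either study: the function names, the `baseline_gap` parameter, the sample size and the normally distributed ratings are all hypothetical. It simulates a post-only, two-group design in which the manipulation has no effect at all, and compares the Welch t statistic when the groups start from the same baseline with the statistic when they do not.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def simulate(baseline_gap, effect=0.0, n=400, seed=0):
    """Return the t statistic for one simulated post-only, two-group study.

    baseline_gap: hypothetical pre-existing difference between the group means
    effect:       true effect of the manipulation (0.0 means none at all)
    """
    rng = random.Random(seed)
    # Group A's scores reflect its baseline plus whatever the manipulation adds.
    group_a = [rng.gauss(baseline_gap + effect, 1.0) for _ in range(n)]
    # Group B is the reference group with baseline mean 0.
    group_b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return welch_t(group_a, group_b)


# Same random draws in both runs; only the assumed baseline gap differs.
t_equal = simulate(baseline_gap=0.0)
t_unequal = simulate(baseline_gap=0.5)
print(f"equal baselines:   t = {t_equal:.2f}")
print(f"unequal baselines: t = {t_unequal:.2f}")
```

With `effect=0.0` throughout, any large t statistic in the second run comes purely from the assumed baseline gap, and the t-test on the post-manipulation scores has no way of telling that apart from a genuine effect of the manipulation.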