An example of what I mean:

I have a certain essay from all students at a university. I take a 1% simple random sample (not stratified) and run a time-consuming computational analysis on each essay independently. I then realize that I forgot to include students whose names start with "M" in the sampled population. The "M" essays were collected at the same time and under the same conditions as all the others. I take a 1% random sample from the "M" essays and run the same independent analysis on each. I then combine the results from both samples for a single statistical analysis. (Assume there are no rounding issues.)
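To make the setup concrete, here is a minimal simulation sketch of the procedure described above. Everything in it is an assumption for illustration: the group sizes (9,000 non-"M" and 1,000 "M" essays), the Gaussian "scores" standing in for the per-essay analysis, and the function names `two_stage_sample` and `simulate`. The "M" scores are deliberately shifted so that a relationship between name and outcome exists. Because the same fraction is drawn from each group, the pooled sample behaves like a proportionate stratified sample with two strata, and the sketch checks whether the pooled mean tracks the population mean on average.

```python
import random
import statistics


def two_stage_sample(non_m, m, fraction, rng):
    """Draw the same fraction independently from each group and pool the draws."""
    first = rng.sample(non_m, round(len(non_m) * fraction))
    second = rng.sample(m, round(len(m) * fraction))
    return first + second


def simulate(n_trials=2000, seed=0):
    """Return (population mean, average pooled-sample mean over n_trials)."""
    rng = random.Random(seed)
    # Hypothetical scores; the "M" group is shifted upward on purpose.
    non_m = [rng.gauss(50, 10) for _ in range(9000)]
    m = [rng.gauss(60, 10) for _ in range(1000)]
    true_mean = statistics.fmean(non_m + m)
    estimates = [
        statistics.fmean(two_stage_sample(non_m, m, 0.01, rng))
        for _ in range(n_trials)
    ]
    return true_mean, statistics.fmean(estimates)


true_mean, avg_estimate = simulate()
print(f"population mean: {true_mean:.3f}")
print(f"average pooled-sample mean: {avg_estimate:.3f}")
```

In this sketch the two printed values should agree closely, since each group contributes to the pooled sample in proportion to its size. The pooled sample is not identical to a single simple random sample of the whole population, though: the number of "M" essays is fixed rather than random, which matters if the later analysis assumes simple random sampling.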

  • Is there some reason why the "M"s were separate? Or did you just not get any "M"s by chance when you drew your 1st sample? – gung - Reinstate Monica Mar 02 '15 at 23:59
  • @gung I forgot to include "M"s in the original population. Just now edited for clarity. – Warren Whipple Mar 03 '15 at 00:07
  • Naïvely, doesn't it depend on the true distribution of whatever it is you are trying to measure? If there should be no correlation between that and student last name, then it shouldn't matter. But if you are concerned that there may be a relationship between the target and student last name, or some other latent variable that may find expression in student last name (like ethnicity or family culture), then you do have a potential problem. – Avraham Dec 22 '21 at 14:26

0 Answers