
Suppose I am conducting an observational study, either case-control or cohort, in one of the following situations.

Case 1: Suppose I find that the enrollment rate of subjects with the disease is too low, where the disease outcome is of interest (case-control).

Case 2: Suppose I find that the enrollment rate of subjects with some risk factor is too low, where the association between the risk factor and some disease is of interest (cohort).

In either case, I then adjust my sample size by inflating it.
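For concreteness, here is a rough sketch of the kind of inflation I have in mind, using one standard fixed-sample approximation for comparing two exposure proportions; the function and all rates are hypothetical:

```python
from math import ceil
from scipy.stats import norm

def cases_needed(p1, p2, k=1.0, alpha=0.05, power=0.80):
    """Cases required to detect exposure proportions p1 (cases) vs. p2
    (controls) with k controls per case (standard fixed-sample formula)."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return ceil(z**2 * (p1 * (1 - p1) + p2 * (1 - p2) / k) / (p1 - p2) ** 2)

# Planned 1:1 design vs. a rescue design using 4 controls per scarce case.
print(cases_needed(0.30, 0.15, k=1))  # 118 cases (plus 118 controls)
print(cases_needed(0.30, 0.15, k=4))  # 85 cases (plus 340 controls)
```

If the number of cases is what falls short, raising $k$ (more controls per case) is one way to hold the planned power without waiting on case accrual.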

$Q:$ Does this constitute "looking back," inflating the type I and type II errors? In particular, I need to peek at the data to determine either the covariate of interest or the outcome of interest. Once a subject's data are collected, the statistic is fixed, with no further random fluctuation; this seems to be the situation in a case-control study and in a retrospective cohort study. Am I not effectively computing some test statistic no matter what I do? If so, the inflated sample size no longer matches the desired power.

$Q':$ Should one even be allowed to adjust the sample size, keeping the previously specified power and confidence level, after discovering the low-enrollment issue?
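To make the worry concrete, here is a toy simulation (all numbers made up) of the extreme version of the problem: the decision to extend the sample depends on an interim test of the very association under study, and the final test is then carried out at the nominal level as if the sample size had been fixed in advance:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n_interim, n_final, n_sims = 0.05, 50, 100, 20_000
rejections = 0

for _ in range(n_sims):
    # Both groups come from the same distribution, so the null is true.
    x = rng.normal(size=(2, n_final))
    # Interim look at the first n_interim subjects per group.
    if stats.ttest_ind(x[0, :n_interim], x[1, :n_interim]).pvalue < alpha:
        rejections += 1  # stop early and declare significance
    else:
        # Interim result not significant: extend to n_final and test
        # the full sample at the same nominal level, uncorrected.
        rejections += stats.ttest_ind(x[0], x[1]).pvalue < alpha

print(f"empirical type I error: {rejections / n_sims:.3f} (nominal {alpha})")
# Typically prints about 0.08: two uncorrected looks inflate the 5% level.
```

If the extension decision instead depended only on accrual counts, and not on the interim test statistic, I would expect the inflation to be smaller or absent, but that is exactly the part I am unsure about.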

user45765
  • What are your criteria for determining that subsample sizes are "too low"? If those are based on not establishing the statistical significance of an expected effect, then you're cheating. If they are based on, say, pre-established criteria for minimum sample sizes, then with some care (such as not yet peeking at your data!) it's perfectly fine to enlarge the sample. You could even modify the sample design from (what sounds like) a simple random sample into a stratified sample, provided you correctly compute all the inclusion probabilities needed for making estimates and formal testing (see the sketch after these comments). – whuber Dec 23 '22 at 16:05
  • @whuber I find the "peeking at your data" part confusing, as I have never seen a formal definition of it. If I peek at covariates (risk factors) without peeking at the outcome, is that peeking at my data? Or if I peek only at the outcome without the covariates, is that peeking at my data? For case-control, I would probably be interested in controlling the disease outcome. It could be that I estimated 10 diseased people would enroll, but later found out that there are only 5, whereas the non-diseased part remained the same. Then the power will be decreased. – user45765 Dec 23 '22 at 16:20
  • @whuber I am not designing it, but I was told that I can modify the sample size in an observational study. That is why I am asking the question. – user45765 Dec 23 '22 at 16:23
  • https://stats.stackexchange.com/questions/310119 and https://stats.stackexchange.com/questions/20676 discuss this issue. It would be difficult to characterize every procedure one could apply to data, and every decision one could take as a result, in terms of the effects on the subsequent statistical analysis, but some rules are clear: if the decision about when to halt data collection depends on the values of the data, the assumptions used to justify statistical tests with fixed sample sizes are violated. That means you would have to do much theoretical work to justify using your planned tests. – whuber Dec 23 '22 at 16:28
  • BTW, any modification of the sample size is part of the experimental design. So, like it or not, by contemplating additional data collection you are designing your experiment. – whuber Dec 23 '22 at 16:29
  • @whuber I see. Thanks a lot for the related posts' clarifications. – user45765 Dec 23 '22 at 16:32
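To illustrate the stratification point in whuber's first comment, here is a toy Horvitz-Thompson-style (Hájek) sketch, with entirely hypothetical strata and sampling rates: oversampling the scarce stratum does not bias estimation as long as each sampled unit is weighted by the inverse of its inclusion probability.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population: a scarce diseased stratum and a large healthy one.
pop = {
    "diseased": rng.normal(2.0, 1.0, 1_000),
    "healthy":  rng.normal(0.0, 1.0, 99_000),
}
# Inclusion probabilities: oversample the scarce stratum after low accrual.
pi = {"diseased": 0.50, "healthy": 0.01}

num = den = 0.0
for stratum, values in pop.items():
    sampled = values[rng.random(values.size) < pi[stratum]]
    num += sampled.sum() / pi[stratum]  # weight each unit by 1 / pi
    den += sampled.size / pi[stratum]   # estimated population size

true_mean = np.concatenate(list(pop.values())).mean()
print(f"weighted estimate: {num / den:.3f}, true mean: {true_mean:.3f}")
# An unweighted sample mean would be pulled toward the oversampled stratum.
```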

0 Answers