How to apply Wilcoxon test to small sample with NAs?

Question

I have the following data, which show some values before and after a drug. I want to test whether the drug increased the mean. I think I should use the Wilcoxon test, but I don't know what to do with the missing values (NAs). Do I need to remove the NAs and the corresponding before values?

before after
2.2    NA
8.2    5.2
9.5    NA
10.5   4
12.5   6
10.5   3
9.7   10
11.5   NA

You might want to know why the values are missing, since that could bias your conclusion. If there's nothing systematic about the missingness (a systematic cause would be, say, people lacking after measurements because they died), then I think you can safely throw them out. – dsaxton, Dec 07 '15 at 14:20

score 1 · Answer 1 · answered Dec 07 '15 at 13:53

I think you should remove the rows with NA, even though the data set is already very small. I don't believe you can make an inference and hypotize some values if you are studying the drug for the first time.
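To make the "remove the NAs" route concrete, here is a minimal Python sketch using only the standard library. It drops the incomplete pairs from the question's data and runs an exact Wilcoxon signed-rank test by enumerating all sign assignments. The helper `midranks` and the variable names are illustrative, `None` stands in for NA, and dropping zero differences is one common convention rather than the only one.

```python
from itertools import product

# Data from the question; None stands in for NA.
before = [2.2, 8.2, 9.5, 10.5, 12.5, 10.5, 9.7, 11.5]
after = [None, 5.2, None, 4, 6, 3, 10, None]

# Complete-case analysis: keep only pairs where both values were observed.
pairs = [(b, a) for b, a in zip(before, after) if b is not None and a is not None]
diffs = [a - b for b, a in pairs if a != b]  # zero differences are dropped


def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


ranks = midranks([abs(d) for d in diffs])
w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)  # sum of positive ranks

# Exact null distribution: under H0 each rank carries + or - with equal probability.
null = [
    sum(r for r, s in zip(ranks, signs) if s)
    for signs in product([0, 1], repeat=len(ranks))
]
p_increase = sum(w >= w_plus for w in null) / len(null)  # H1: after > before
p_decrease = sum(w <= w_plus for w in null) / len(null)  # H1: after < before

print(len(pairs), w_plus, p_increase, p_decrease)
```

Only five complete pairs remain, so the smallest one-sided exact p-value this test can ever produce is 1/32, which shows how little information is left after dropping the NAs.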

GGA

That is also what I thought – Günal Dec 07 '15 at 14:04
Hypotise = hypothesise? – Nick Cox Dec 07 '15 at 14:33

score 1 · Answer 2 · edited Apr 13 '17 at 12:44

First, you should consider whether Wilcoxon is the best choice here. Check the thread "How to choose between t-test or non-parametric test e.g. Wilcoxon in small samples" for a great answer by Glen_b, where he compares the pros and cons of different hypothesis tests for small samples. It appears that in very small samples the $t$-test may be the most robust choice.
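For comparison, a paired $t$ statistic on the question's data could be computed roughly as follows. This is a minimal Python sketch using the standard library only; restricting it to the five complete pairs is an assumption of the sketch, not a recommendation.

```python
from math import sqrt
from statistics import mean, stdev

# Complete (before, after) pairs from the question.
pairs = [(8.2, 5.2), (10.5, 4), (12.5, 6), (10.5, 3), (9.7, 10)]

# Paired test: work on the within-subject differences (after - before).
diffs = [a - b for b, a in pairs]
n = len(diffs)

# t = mean difference divided by its standard error; stdev uses the n - 1 denominator.
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
df = n - 1

print(round(mean(diffs), 2), round(t_stat, 2), df)
```

The statistic is then referred to a $t$ distribution with $n - 1 = 4$ degrees of freedom, one-sided or two-sided depending on the hypothesis.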

Second, removing the missing values is in most cases not the best option. What do you know about the missing values? How did it happen that you lack information about those cases? If they are missing completely at random, you could impute them with a single value (e.g. the mean, median, or mode), use multiple imputation, or apply one of a number of other techniques for dealing with missing data. If there is some pattern in when values are missing, then maybe your statistical model should also account for the fact that some cases are missing?
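As a rough illustration of the simplest option mentioned here, single-value imputation, this is a minimal Python sketch on the question's after column. The `impute` helper is hypothetical, and `None` stands in for NA. Note that filling in a single value treats the imputed numbers as if they had been observed, which understates the uncertainty; multiple imputation is designed to address that.

```python
from statistics import mean, median

# "after" column from the question; None stands in for NA.
after = [None, 5.2, None, 4, 6, 3, 10, None]
observed = [a for a in after if a is not None]


def impute(values, fill):
    """Return a copy of values with every None replaced by fill."""
    return [fill if v is None else v for v in values]


after_mean = impute(after, mean(observed))      # mean imputation
after_median = impute(after, median(observed))  # median imputation

print(after_mean)
print(after_median)
```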

For a general theoretical introduction, you could try Rubin's (1976) paper "Inference and Missing Data", the classic book by Little and Rubin, "Statistical Analysis with Missing Data", or any of the many other sources that can easily be found on Google Scholar.
