I'm confused about some of the results I got after plotting my data.
I have a data set that includes test scores and a binary group assignment of either polypharmacy or non-polypharmacy. The scores are unpaired. I have 132 observations overall, 35 of which are non-polypharmacy.
Here is a reproducible example of my data:
structure(list(MOCA = c(11, 11, 14, 13, 10, 11, 12, 16, 6, 13, 12, 10, 14, 4, 5, 8, 7, 13, 5, 12, 14, 7, 15, 8, 11, 12, 14, 16, 3, 10, 16, 9, 7, 14, 14, 10, 4, 12, 16, 12, 13, 5, 12, 9, 13, 11, 14, 13, 12, 11, 10, 12, 9, 11, 14, 10, 2, 14, 16, 16, 13, 9, 13, 11, 12, 12, 16, 14, 12, 7, 13, 14, 11, 13, 16, 13, 14, 6, 10, 11, 13, 14, 9, 16, 13, 16, 8, 12, 12, 11, 11, 12, 14, 11, 14, 6, 11, 13, 12, 12, 12, 1, 14, 16, 9, 16, 10, 12, 16, 13, 6, 11, 17, 11, 13, 9, 14, 13, 13, 14, 14, 13, 4, 7, 12, 13, 12, 14, 16, 14, 13, 14), Group = c("Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", 
"Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Non-Polypharmacy")), class = c("rowwise_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -132L), groups = structure(list( .rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L, 88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L, 99L, 100L, 101L, 102L, 103L, 104L, 105L, 106L, 107L, 108L, 109L, 110L, 111L, 112L, 113L, 114L, 115L, 116L, 117L, 118L, 119L, 120L, 121L, 122L, 123L, 124L, 125L, 126L, 127L, 128L, 129L, 130L, 131L, 132L), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", "list"))), row.names = c(NA, -132L), class = c("tbl_df", "tbl", "data.frame")))
I used wilcox.test to test for significance, which produced the following results:
Wilcoxon rank sum test with continuity correction

data: MOCA by Group
W = 2338, p-value = 0.0008804
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
0.9999611 2.9999549
sample estimates:
difference in location
1.999979

So according to this test, the difference between the groups is significant, but I wanted to confirm this visually with a box plot. When I plotted the data, I found that the two boxes overlap quite a bit. Here is the box plot in question:
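For reference, here is a minimal sketch of the kind of code that gives test output and a box plot of this form. To keep it short it uses only the first 20 rows of the data above, so its numbers will not match the results quoted here. The data frame name `df` and the axis label are assumptions made for illustration.

```r
# First 20 rows of the data above, used only to keep the sketch short.
# `df` is a name assumed here for illustration.
df <- data.frame(
  MOCA = c(11, 11, 14, 13, 10, 11, 12, 16, 6, 13,
           12, 10, 14, 4, 5, 8, 7, 13, 5, 12),
  Group = factor(c(
    "Polypharmacy", "Polypharmacy", "Non-Polypharmacy", "Polypharmacy",
    "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy",
    "Polypharmacy", "Polypharmacy", "Polypharmacy", "Polypharmacy",
    "Polypharmacy", "Polypharmacy", "Polypharmacy", "Non-Polypharmacy",
    "Polypharmacy", "Non-Polypharmacy", "Polypharmacy", "Non-Polypharmacy"
  ))
)

# Unpaired Wilcoxon rank sum test through the formula interface.
# conf.int = TRUE adds the confidence interval and the estimated
# difference in location (first factor level minus second level).
# Tied scores make R warn that exact values cannot be computed.
res <- wilcox.test(MOCA ~ Group, data = df, conf.int = TRUE)
print(res)

# Side-by-side box plots of the scores, one box per group.
boxplot(MOCA ~ Group, data = df, ylab = "MOCA score")
```

With the full data set, the same two calls would be run on all 132 rows instead of this subset.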
Could someone enlighten me as to what is going on? Are my results valid, and if so, is there some way I can demonstrate it? Did I do something wrong? Any advice or insights would be greatly appreciated!

