I am quite new to statistical analysis and have what is probably a newbie question. I have a dataset that contains year and state data for two different groups. I am trying to compare the group means for each state and year combination (around 40 unique state-year combinations in total), but I am running into a few problems:
- The distributions for each group by year and state are not normal.
- The sample sizes are erratic and unbalanced, e.g. for 2018 + Kansas, group 1 may have a sample size of 1,100 and group 2 may have a sample size of 3,600. In extreme cases, group 1 may have a sample size of 10,000 and group 2 a sample size of 3,000.
- In general the variances are equivalent (though not always).
Based on this, I felt that a Mann-Whitney non-parametric test would be the best way to compare the distributions of the two groups within each state and year. However, I have been doing some reading and noticed that the statistical power of the Mann-Whitney test is said to be reduced when sample sizes are very large.
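For concreteness, here is a minimal sketch of the per-combination comparison I have in mind. It is plain Python with no external libraries. The function names and the `(state, year, group, value)` row layout are made up for illustration and are not from my actual dataset. The p-value uses the large-sample normal approximation with a tie correction and no continuity correction, so it may differ slightly from what a statistics package reports.

```python
from collections import defaultdict
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one observation")
    n = n1 + n2
    # Pool the samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t**3 - t
        i = j

    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every value is identical, so there is nothing to test
        return u, 1.0
    z = (u - mean_u) / sqrt(var_u)
    p = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u, p


def compare_by_state_year(rows):
    """Run the test once per (state, year); groups are labelled 1 and 2."""
    cells = defaultdict(lambda: {1: [], 2: []})
    for state, year, group, value in rows:
        cells[(state, year)][group].append(value)

    results = {}
    for key, groups in sorted(cells.items()):
        u, p = mann_whitney_u(groups[1], groups[2])
        results[key] = {"n1": len(groups[1]), "n2": len(groups[2]), "U": u, "p": p}
    return results
```

This would give one U statistic and one p-value per state-year combination, so roughly 40 tests in my case.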
Is there any guidance on the best statistical test to compare these various combinations? Any resources or feedback would be greatly appreciated!