I am working on a dataset showing the number of birds at each site in two baseline studies, 1 and 2. I have bird counts at each site for 20 days for each baseline. I want to find out which site has the highest number of birds in each baseline. I used the Kruskal-Wallis test, since the data aren't normally distributed, and then pairwise Wilcoxon tests to compare the sites. Is this a correct approach? I also have the number of birds on the ground and in the air at each site. I want to test whether there is any significant difference between ground counts and air counts. Which test is the best way to go about it?
1 Answer
If you have more than 2 sites represented (which you have implied), then I agree with your use of Kruskal-Wallis and pairwise Wilcoxon comparisons, with site as the IV and number of birds as the DV. To double-check your conclusion about non-normality, consider using a histogram and/or a Q-Q plot as a visual aid.
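As a rough illustration of that workflow, here is a minimal Python sketch using SciPy. The site names and the Poisson-generated counts are made up for the example, and the Holm adjustment for the pairwise p-values is my assumption about how you would want to handle multiple comparisons; swap in your own data and preferred correction.

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: 20 daily counts per site for one baseline.
counts = {
    "site_A": rng.poisson(5, size=20),
    "site_B": rng.poisson(6, size=20),
    "site_C": rng.poisson(15, size=20),
}

# Omnibus test: do the sites differ in their count distributions?
h_stat, p_omnibus = stats.kruskal(*counts.values())

# Pairwise two-sided rank-sum (Mann-Whitney U) tests between sites.
pairs = list(combinations(counts, 2))
raw_p = np.array([
    stats.mannwhitneyu(counts[a], counts[b], alternative="two-sided").pvalue
    for a, b in pairs
])

# Holm step-down adjustment for multiple comparisons (an assumed choice).
m = len(raw_p)
order = np.argsort(raw_p)
adj_sorted = np.maximum.accumulate((m - np.arange(m)) * raw_p[order])
holm_p = np.empty(m)
holm_p[order] = np.minimum(adj_sorted, 1.0)

print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_omnibus:.4g}")
for (a, b), p in zip(pairs, holm_p):
    print(f"{a} vs {b}: Holm-adjusted p = {p:.4g}")
```

You would run this once per baseline. Note that a significant pairwise result tells you two sites differ, so you still need to look at the medians or mean ranks to say which site is highest.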
As for your second round of testing on air vs ground birds:
You could isolate the sites and do an independent-samples t-test (or an equivalent robust method) between ground and air counts across the days, for each site. So, say you had 20 days' worth of data: you'd have 20 observations each of air and ground counts, and you could compare those counts.
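A minimal sketch of the per-site comparison, assuming simulated counts for one hypothetical site. It uses the Mann-Whitney U test as the non-parametric counterpart to the independent-samples t-test. I have also included a Wilcoxon signed-rank test, which is my own addition: if the ground and air counts were recorded on the same days, the observations are paired by day and a paired test may suit the design better.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_days = 20

# Hypothetical daily counts for a single site.
ground = rng.poisson(12, size=n_days)
air = rng.poisson(6, size=n_days)

# Independent-samples rank test: treats the two sets of counts as unrelated.
mw = stats.mannwhitneyu(ground, air, alternative="two-sided")

# Paired alternative (assumption: day i of `ground` matches day i of `air`).
sr = stats.wilcoxon(ground, air)

print(f"Mann-Whitney U = {mw.statistic:.1f}, p = {mw.pvalue:.4g}")
print(f"Wilcoxon signed-rank W = {sr.statistic:.1f}, p = {sr.pvalue:.4g}")
```

If you repeat this for every site, remember that you are running one test per site, so some multiple-comparison adjustment is worth considering.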
Depending on the number of sites, you could total the number of air and ground birds at each site across the days and use each site as an observation. Therefore, if you had 10 sites, you'd have 10 observations each of air and ground total counts. Similar to above, you could use an independent-samples t-test or a non-parametric alternative.
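The aggregation step might look like the sketch below. The array layout (rows are sites, columns are days) and the simulated rates are assumptions for illustration. Because each site contributes both a ground total and an air total, I have used a signed-rank test on the site totals here; that is my choice, not the only option.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_sites, n_days = 10, 20

# Hypothetical counts: rows = sites, columns = days.
site_rate = rng.uniform(3, 15, size=(n_sites, 1))
ground = rng.poisson(site_rate * 1.5, size=(n_sites, n_days))
air = rng.poisson(site_rate, size=(n_sites, n_days))

# Collapse the days so that each site becomes a single observation.
ground_total = ground.sum(axis=1)
air_total = air.sum(axis=1)

# Compare the per-site totals (paired by site).
res = stats.wilcoxon(ground_total, air_total)
print(f"Wilcoxon signed-rank W = {res.statistic:.1f}, p = {res.pvalue:.4g}")
```

With only as many observations as sites, this version has much less power than the day-level comparison, which is why the number of sites matters.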
Based on your description, I believe you may be able to do a two-way ANOVA, with site and status (ground vs. air) as your IVs and counts as your DV. I've seen permutation tests discussed as a recommended robust strategy for this. See this answer
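To make the permutation idea concrete, here is a rough sketch for a balanced design. It is not taken from the linked answer. It tests only the status main effect, and the scheme (shuffling the ground/air labels within each site) is one of several possible permutation strategies that I have assumed for illustration. The data are simulated and the function names are hypothetical.

```python
import numpy as np


def status_f(y):
    """F statistic for the status main effect in a balanced two-way layout.

    y has shape (n_sites, n_status, n_days).
    """
    n_sites, n_status, n_days = y.shape
    grand = y.mean()
    status_means = y.mean(axis=(0, 2))
    cell_means = y.mean(axis=2, keepdims=True)
    ss_status = n_sites * n_days * np.sum((status_means - grand) ** 2)
    ss_error = np.sum((y - cell_means) ** 2)
    df_status = n_status - 1
    df_error = n_sites * n_status * (n_days - 1)
    return (ss_status / df_status) / (ss_error / df_error)


def permute_within_site(y, rng):
    """Shuffle the observations across status and day, separately per site."""
    n_sites, n_status, n_days = y.shape
    flat = y.reshape(n_sites, n_status * n_days)
    return rng.permuted(flat, axis=1).reshape(y.shape)


rng = np.random.default_rng(3)
n_sites, n_days = 4, 20

# Hypothetical counts: ground rates are 1.5 times the air rates at each site.
site_rate = np.array([5.0, 8.0, 12.0, 20.0])[:, None]
ground = rng.poisson(site_rate * 1.5, size=(n_sites, n_days))
air = rng.poisson(site_rate, size=(n_sites, n_days))
y = np.stack([ground, air], axis=1).astype(float)  # (site, status, day)

f_obs = status_f(y)
n_perm = 999
f_perm = np.array(
    [status_f(permute_within_site(y, rng)) for _ in range(n_perm)]
)

# Permutation p-value, counting the observed statistic as one permutation.
p_value = (1 + np.sum(f_perm >= f_obs)) / (n_perm + 1)
print(f"F(status) = {f_obs:.2f}, permutation p = {p_value:.4g}")
```

Shuffling within site keeps the site differences intact, so the reference distribution reflects "no ground vs. air difference at any site". Testing the site effect or the interaction would need a different permutation scheme.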
Of course you should consider which of these methods is most valuable to answer your specific research question regarding ground and flying birds.