2

I'm new to stats and am using GraphPad Prism 9 to check homoscedasticity for a Kruskal-Wallis test. Is this heteroscedastic? It looks cone-shaped to me. Also, would this mean that the Kruskal-Wallis test is invalid, since heteroscedasticity means the distribution is not the same in each group? [Image: residual plot]

Naj
  • 21
  • Welcome to the site, Naj! These calls are always tough. One way to look at it: if we remove a single point (the top point at 8), is there still evidence of heteroscedasticity? Do we want our diagnosis of heteroscedasticity to hinge on a single observation? On the other hand, since heteroscedasticity is the more general case, it can be viewed as conservative to go down that path. – John Madden Dec 15 '22 at 22:29
  • Thank you John! This really helped my understanding. From the perspective you've provided I do think it's homoscedastic. – Naj Dec 15 '22 at 23:29
  • a final question @JohnMadden, if you don't mind me asking: as homoscedasticity looks at variance would it be incorrect to use this to justify that distribution is the same between groups and therefore justify the use of the Kruskal Wallis test? I say this because Kruskal Wallis is a nonparametric test (so doesn't use variance) and one condition that must be met is identical distribution shapes between groups (not necessarily normal in nature). – Naj Dec 15 '22 at 23:33
  • What is your sample size? If there is overplotting in the graph, i.e., multiple points in one place that we can't see, this is hard to assess visually, and you had better produce a graph that shows clearly how many observations are where. Also, I'd always plot the raw data as well, i.e., groups vs. y. In fact the raw data plot may show better whether any heteroscedasticity has the potential to mislead the test. – Christian Hennig Dec 18 '22 at 11:27

1 Answer

4

The plot suggests mild heteroskedasticity (what are the sample sizes?).

However, this potential heteroskedasticity may be of little consequence.

Despite many sources saying otherwise, you do not have to have constant sample variance for the Kruskal-Wallis test to behave as it should.

You assume identical distributions under $H_0$ (in order that there's exchangeability) but you don't need to assume constant spread under $H_1$. Since $H_0$ is probably not exactly true, the appearance of the samples may tell you little about whether that assumption is reasonable; in particular, if spread changes as the mean increases (not necessarily proportionally), everything may still be fine.

Consider, for example, that the Kruskal-Wallis test is invariant to monotonic transformation. If you had constant variance across the whole of $H_1$ on some scale you would not have it after any nonlinear transformation of the variables -- yet the Kruskal-Wallis test would not change on the new scale; it would have the same test statistic. Clearly, then, the spread and the shape might both change as the locations change, without any harm to the Kruskal-Wallis.
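The invariance argument above can be checked numerically. The following is a minimal sketch on simulated data, not anything taken from the question: the group means, the sample size of 30, the seed, and the helper name `kruskal_wallis_h` are all assumptions made for illustration. It computes the H statistic by hand from ranks, without a tie correction (so it assumes continuous data with no ties); in practice one would normally use a library routine such as `scipy.stats.kruskal`.

```python
import math
import random
import statistics


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from pooled ranks (hypothetical helper; assumes no ties)."""
    # Pool all observations, remembering which group each one came from.
    pooled = sorted((value, gi) for gi, group in enumerate(groups) for value in group)
    rank_sums = [0.0] * len(groups)
    for rank, (_, gi) in enumerate(pooled, start=1):
        rank_sums[gi] += rank
    n_total = len(pooled)
    return (12.0 / (n_total * (n_total + 1))
            * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
            - 3.0 * (n_total + 1))


random.seed(0)

# Three simulated groups with equal spread and shifted locations (illustrative only).
groups = [[random.gauss(mu, 1.0) for _ in range(30)] for mu in (0.0, 1.0, 2.0)]

# A strictly increasing, nonlinear transformation: spread now grows with location.
transformed = [[math.exp(x) for x in g] for g in groups]

h_raw = kruskal_wallis_h(groups)
h_exp = kruskal_wallis_h(transformed)

print("H on original scale:   ", h_raw)
print("H on transformed scale:", h_exp)
print("sample variances, original:   ", [statistics.variance(g) for g in groups])
print("sample variances, transformed:", [statistics.variance(g) for g in transformed])
```

Because the transformation preserves the ordering of the pooled observations, the ranks and therefore H are the same on both scales, even though the group variances are roughly equal on one scale and clearly unequal on the other.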

Correspondingly, you don't need to be considering pure location-shift alternatives, either (another fairly common but mistaken assertion).

Consequently you might have nothing to worry about at all.

[Edit: one thing that does concern me a little is the seeming coincidence of multiple points; is the variable discrete?]

Where do the values in your plot come from? Are they output from the Kruskal-Wallis procedure in GraphPad Prism? If so, the terms in the plot -- 'residuals' and 'predicted values' -- seem as if they may themselves be based on assuming a location-shift alternative, which as I mentioned needn't apply for the Kruskal-Wallis to be perfectly valid.

Does GraphPad Prism suggest that you must have homoskedasticity in your sample data to use Kruskal-Wallis?

Glen_b
  • 282,281