4

One of the reviewers of a paper of mine suggested performing a homoscedasticity test between the results of two experiments that tested the same thing under two conditions. The experiments consisted of ratings on a 7-point Likert scale. The results of one experiment were distributed over a large range of values, while those of the other showed a tendency towards the center of the Likert-scale range (i.e. 3.5). The reviewer argued that the latter behaviour could be due to the fact that participants answered more randomly, and he suggested that to verify this possibility, a homoscedasticity test should be performed comparing the two conditions.

Now, I would like to understand this comment. In relation to my case, what does it mean that the variances of the two conditions are significantly different? And what does it mean if they are equal?

Secondly, and more importantly, which test for homoscedasticity do you suggest I perform? I use R. Can you suggest the function most suitable for my case? I saw that there are many.

L_T

2 Answers

2

I think this question has been addressed a few times before, although not in the context of your paper. Even though the Likert scale is ordinal, some people here argue that average scores can be looked at as numerical (interval). If you subscribe to that theory, the variance of the averages has some meaning. Also, even though a Likert scale is discrete, the individual scores could be thought of as interval if you believe that respondents think the change from 2 to 3 is the same as from 3 to 4, etc. This may be a stretch. Of course, if the raw data were normally distributed you could use the F test for comparing the variances of two normally distributed variables. Unless the sample size is large, it is difficult to reject homoscedasticity. It has also been pointed out that the F test is not robust to departures from normality, so it is safer to use a robust test. Levene's test is one of the most common ones, and modifications to it have been proposed. Still, you may be uncomfortable treating ordinal data as though it were interval. So maybe a nonparametric test based only on ranking the data or on permutations would be suitable. Then you are really asking whether the distributions have different scales.
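To make the contrast between the F test and a Levene-type robust test concrete, here is a minimal sketch in base R. The data are simulated 7-point ratings and the names `wide`, `narrow`, `ratings` and `group` are made up for illustration; they are not from the question. The robust test is written out by hand as the Brown-Forsythe variant (a one-way ANOVA on absolute deviations from each group's median), so no extra package is assumed.

```r
# Hypothetical 7-point Likert ratings, simulated only for illustration.
set.seed(1)
wide   <- sample(1:7, 40, replace = TRUE)  # spread over the whole scale
narrow <- sample(3:5, 40, replace = TRUE)  # concentrated near the centre

# Classical F test for the ratio of two variances (assumes normality).
f_res <- var.test(wide, narrow)

# Brown-Forsythe variant of Levene's test, written out by hand:
# a one-way ANOVA on absolute deviations from each group's median.
ratings <- c(wide, narrow)
group   <- factor(rep(c("wide", "narrow"), each = 40))
abs_dev <- abs(ratings - ave(ratings, group, FUN = median))
bf_res  <- anova(lm(abs_dev ~ group))

f_p  <- f_res$p.value           # p-value of the F test
bf_p <- bf_res[["Pr(>F)"]][1]   # p-value of the Brown-Forsythe test
print(c(F_test = f_p, Brown_Forsythe = bf_p))
```

Both tests still treat the ratings as numbers, so they only make sense if you accept the interval interpretation discussed above.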

There are apparently three well-known nonparametric tests for equality of scale in two distributions, with well-known properties, and some new ones appearing in the literature. A Google search using "nonparametric tests of scale" turns up a lot of interesting links. There is even one that discusses whether Likert scales should be viewed parametrically or nonparametrically, with a paper like yours instigating the discussion. Do the search.

But to help out, here are links to a few papers. Full PDFs are not available.

  1. Mathur
  2. Klotz
  3. Penfield.
  • Dear @Michael Chernick, thanks for your answer. Although it enlightened me, I am still looking for the answer to the question I posed: "In relation to my case, what does it mean that the variances of the two conditions are significantly different?". Can you explain this to me, please? In addition, is the Bartlett test a good test for homoscedasticity in my case? I found that test for homoscedasticity in R. Do you have other suggestions? – L_T Sep 15 '12 at 09:12
  • More information about your research is needed to answer. Generally, if the difference is significant and you have measured some attitude, then the narrow-ranged case may indicate that under this condition people are uncertain about their attitude. Another example: people on tranquilizers will tend to choose central options in questions that measure some emotional attitude. – O_Devinyak Sep 15 '12 at 10:29
  • @Luca Because the parametric ANOVA test requires that the groups each have normal distributions with the same variance, checking the assumption can sometimes be important. Basically, the F test for homoscedasticity looks at the ratio of sample variances to test whether or not the ratio of population variances differs significantly from 1. The test depends heavily on the normality assumption. So, for cases where normality fails, robust tests of the equality of variance have been devised. – Michael R. Chernick Sep 15 '12 at 11:26
  • More generally, for discrete data, and particularly discrete data such as Likert scales, which are ordinal, nonparametric tests for scale differences have been proposed. For some probability distributions a variance may not exist, but the concept of a difference in variation between two such distributions remains valid. So nonparametric tests which look for differences in scale are applied in such situations. Your situation seems to fit this. Robust tests that are still comparing variances will not be appropriate when the variances do not exist. – Michael R. Chernick Sep 15 '12 at 11:33
  • So I think the nonparametric scale tests that I referenced would be most appropriate for your situation. The Bartlett test is really similar to the F test because it also assumes normality and is sensitive to departures from normality in the distributions of the two groups. So it is not as good as robust tests such as the Levene test or the Brown-Forsythe test. I suggest that you not use the robust tests either. Choose from the nonparametric tests for scale differences. – Michael R. Chernick Sep 15 '12 at 11:40
  • @fosgen what other information do you need? @Michael Chernick I am not able to find the nonparametric tests for scale differences in R. Do you know the name of the function in R? – L_T Sep 15 '12 at 12:40
  • I don't know much about R and particularly do not know what people have coded for these algorithms. I think you could search for them in the CRAN library system and maybe you will find something there. – Michael R. Chernick Sep 15 '12 at 12:49
  • @Luca There are two possible reasons to perform a homoscedasticity test. The first one is to check whether the assumptions of further parametric testing (ANOVA) are violated. This case is well described by Michael Chernick. R has two functions for nonparametric testing of scale in the stats package (this package is loaded by default): ansari.test() and mood.test(). But I wonder: nonparametric testing is recommended because of possible violations of normality, yet when the F test is suspect due to lack of normality, ANOVA is also suspect. You may check normality visually. – O_Devinyak Sep 15 '12 at 14:27
  • Compare qqnorm(residuals) against several qqnorm(rnorm(length_of_data)). If they are similar, go to var.test. If the plot of your residuals is more extreme, go to nonparametric testing. And there is another reason to do a homoscedasticity test: the difference in scale may be due to some qualitative difference between samples. To suggest what exactly it could be, one needs to know what you are measuring with the scale and who your respondents are. – O_Devinyak Sep 15 '12 at 14:33
  • Dear @fosgen many, many thanks, but the problem is that I have to compare the homoscedasticity of 2 groups, one against the other. As I wrote at the beginning of this discussion, "the reviewer argued that the latter behaviour could be due to the fact that participants answered more randomly, and he suggested that to verify this possibility, a homoscedasticity test should be performed comparing the two conditions." How can I perform such a comparison? That is the point. Do you know a useful function in R to compare two groups? – L_T Sep 15 '12 at 15:12
  • I have already listed these functions: parametric: var.test(); nonparametric: mood.test() and ansari.test(). – O_Devinyak Sep 15 '12 at 15:18
  • @fosgen I used mood.test(). The results are reported in the answer below. Since I got a p-value > 0.05, I have to conclude that both groups have equal variance, and therefore I reject the idea that the participants answered randomly, as the reviewer suggested testing. Is this conclusion correct? – L_T Sep 15 '12 at 15:51
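For readers following the thread, the three functions named in these comments (var.test(), ansari.test() and mood.test(), all in the stats package) can each be called directly on two vectors. The sketch below uses simulated ratings. The vectors `non_interactive` and `interactive_grp` are hypothetical stand-ins, not the data from the question.

```r
# Hypothetical ratings for the two conditions, simulated for illustration.
set.seed(2)
non_interactive <- sample(1:7, 50, replace = TRUE)
interactive_grp <- sample(2:6, 50, replace = TRUE)

p_values <- c(
  # Parametric F test of the variance ratio (assumes normality).
  F_test = var.test(non_interactive, interactive_grp)$p.value,
  # Ansari-Bradley rank test of scale; exact = FALSE requests the normal
  # approximation, since Likert data are full of ties.
  Ansari = ansari.test(non_interactive, interactive_grp, exact = FALSE)$p.value,
  # Mood's rank test of scale.
  Mood   = mood.test(non_interactive, interactive_grp)$p.value
)
print(round(p_values, 4))
```

When reading the output, keep in mind that a p-value above 0.05 only means the test did not detect a difference in scale. It is not, by itself, evidence that the two variances are equal.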
0

Following the comments above, I found that the solution is to use mood.test().

The result is

> mood.test(scrd_non_interactive$Response,scrd_interactive$Response)

Mood two-sample test of scale

data:  scrd_non_interactive$Response and scrd_interactive$Response 
Z = 0.3895, p-value = 0.6969
alternative hypothesis: two.sided 

From this I conclude that the variances of the two groups are not statistically different.

L_T
  • That would be fine if the reviewer will buy the idea that a Likert scale is like an interval measure. If he doesn't then you should do one of the nonparametric scale change tests. – Michael R. Chernick Sep 15 '12 at 15:59
  • but the mood test is non parametric... – L_T Sep 15 '12 at 16:38
  • It tests variances and not general scale when variance is undefined or does not exist. – Michael R. Chernick Sep 15 '12 at 19:21
  • Oh no. Then is that wrong? Do you have a name for the test you suggest? I am not able to find in R the right function.... Anyone who has a different suggestion from the Mood test kindly suggested by Fosgen? – L_T Sep 15 '12 at 22:03
  • I apologize. Penfield mentions Siegel-Tukey, Mood and Normal Scores as scale tests, so you are okay with Mood's test for scale. – Michael R. Chernick Sep 15 '12 at 22:19
  • I am wondering how you are applying this. Don't you have multiple responses? Those responses would likely be correlated, too, making it difficult to correct for multiple comparisons. Note, too, that this test assumes that each group has the same location. I guess that's ok in this application, under the null hypothesis of no difference in responses between the groups, but conducting the test on residuals might be more convincing. These considerations suggest using Mood's statistic (or the Ansari statistic) but conducting a permutation test with it to compute p-values. – whuber Sep 18 '12 at 20:17
  • Hi... I do not know what you mean by "conducting a permutation test with it to compute p-values". Any suggestions? – L_T Sep 20 '12 at 07:43
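One possible reading of "conducting a permutation test with it" is sketched below, under stated assumptions: compute Mood's statistic on the observed groups, then recompute it many times after randomly reshuffling the group labels, and take the p-value as the share of reshuffled statistics at least as extreme as the observed one. The data are simulated. The names `x`, `y`, `mood_stat` and `n_perm` are illustrative only. Centring each group at its own median is one simple way to work on residuals, and it makes the reshuffling only approximately valid. The sketch also assumes one independent rating per participant, so it does not address the correlated multiple responses raised in the comment.

```r
# Hypothetical ratings, simulated only for illustration.
set.seed(3)
x <- sample(1:7, 40, replace = TRUE)  # condition A
y <- sample(3:5, 40, replace = TRUE)  # condition B

# Residuals: centre each group at its own median so that a shift in
# location is not mistaken for a difference in scale.
rx <- x - median(x)
ry <- y - median(y)

# Mood's standardized statistic Z, taken from stats::mood.test().
mood_stat <- function(a, b) unname(mood.test(a, b)$statistic)

observed <- mood_stat(rx, ry)
pooled   <- c(rx, ry)
n_x      <- length(rx)
n_perm   <- 2000

# Reshuffle the group labels and recompute the statistic each time.
perm_stats <- replicate(n_perm, {
  idx <- sample(length(pooled), n_x)
  mood_stat(pooled[idx], pooled[-idx])
})

# Two-sided permutation p-value (the +1 keeps it strictly above zero).
p_perm <- (sum(abs(perm_stats) >= abs(observed)) + 1) / (n_perm + 1)
print(p_perm)
```

Replacing `mood.test` with `ansari.test` inside `mood_stat` would give the Ansari version of the same procedure.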