paired t-test with ordinal data

Question

Can I do a paired samples t-test when my data are ordinal? My data are reading levels at time 1 and time 2, and the possible levels are A, 1, 2, 3, 4, 6, 8, 10, 12, 14, 20, 24, 28, 30, 34, 38, 40, 50, 60, 70, 80. They appear to be continuous but they are not. Can I still use a paired t-test? These are the reading levels from K through 8th grade. I was able to run a paired t-test on the continuous data, which came from a standardized test with grade level equivalency, but I am not sure if I can also do it for the reading level data.

That doesn't seem ideal. Do you also have covariates you want to control for (eg, the actual grade the child was in)? — gung - Reinstate Monica, Jan 20 '16 at 02:15
What do those levels actually represent? Are they for example, labels for intervalized measurements or counts (e.g. does "70" represent "the child correctly read at least 70 target words, but fewer than 80")? — Glen_b, Jan 20 '16 at 02:17
Note that "can I" may not be the question you want to ask -- no doubt you can do it, it's a straightforward calculation to carry out -- but you probably want to know something else -- e.g. (i) what impact there could be on significance level/power in some situation or situations or (ii) whether some unspecified person or persons might think it acceptable. Please try to say what you really want to know, keeping in mind that you may need to give more information then. — Glen_b, Jan 20 '16 at 02:25
These levels are students' levels on the Developmental Reading Assessment (DRA). I have covariates like English Language Learner and Special Education Classification, as well as attendance, in a separate regression analysis.
I wanted to see the impact of the reading intervention on students' DRA levels. They should ideally be at level 6 at the end of Kindergarten, but many students are far above that level. And to Glen's point, I phrased the question as "can I" because I did not know if I could. I am a novice and thought this was a safe place to pose any question. – Lynette, Jan 21 '16 at 15:26

score 1 · Answer 1 · edited Jan 27 '16 at 17:59

Because these data are ordinal, the distances between levels are not guaranteed to be meaningful, so the paired differences that the t-test is built on are hard to interpret and unlikely to be normally distributed. When that normality assumption fails, a paired t-test can at best lose power and at worst give misleading p-values and estimates. Fortunately, there are non-parametric alternatives to the paired t-test that do not depend on normality and are better suited to ordinal data.

For these data, I would suggest the Wilcoxon signed-rank test, which is designed for paired comparisons on non-normal data.

Here is an example in R:

## first construct our samples to test
# pool of possible ordinal values
# not continuous, however numerical order assumed valid
pool = c(1, 2,3,4,6,8,10, 12, 14, 20, 24, 28, 30, 34, 38, 40, 50, 60, 70, 80)   

# sample 1, randomly chosen from pool values
test1 = sample(pool, 100, replace = TRUE)

# sample 2, randomly chosen from pool values
test2 = sample(pool, 100, replace = TRUE)

# sample 3, pool values, weighted towards higher values (those at end of pool)
prob_vec = 1:length(pool)/sum(1:length(pool))
test3_weighted = sample(pool, 100, replace = TRUE, prob = prob_vec)

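## run the Wilcoxon signed-rank test
# test1 vs test2 should not differ significantly: both are drawn at random
# from the same pool (independently, so the pairing is only for illustration)
wilcox.test(test1, test2, paired = TRUE)
# V = 1849.5, p-value = 0.1985   (one unseeded run; your values will differ)

# test1 (or test2) vs test3_weighted should differ significantly:
# test3_weighted is weighted towards higher values
wilcox.test(test1, test3_weighted, paired = TRUE)
# V = 1221, p-value = 8.495e-05   (one unseeded run; your values will differ)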
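
One caveat worth illustrating: the signed-rank test ranks the absolute paired differences, so it still uses the size of each difference as well as its sign. If the spacing between reading levels is not trusted at all, a sign test, which uses only the direction of each change, is a stricter ordinal option. The sketch below is not part of the original example; it reuses the same simulated setup, and the names time1, time2, n_up and n_down are made up for illustration.

```r
# Sign test on paired ordinal data: only the direction of each change is used.
# Simulated data for illustration only (not the OP's data).
set.seed(1)
pool <- c(1, 2, 3, 4, 6, 8, 10, 12, 14, 20, 24, 28, 30, 34, 38, 40, 50, 60, 70, 80)

# time1: uniform draws; time2: draws weighted towards higher levels
time1 <- sample(pool, 100, replace = TRUE)
prob_vec <- seq_along(pool) / sum(seq_along(pool))
time2 <- sample(pool, 100, replace = TRUE, prob = prob_vec)

n_up   <- sum(time2 > time1)  # pairs that moved up
n_down <- sum(time2 < time1)  # pairs that moved down (tied pairs are dropped)

# Under the null of no systematic shift, ups and downs are equally likely
sign_test <- binom.test(n_up, n_up + n_down, p = 0.5)
sign_test$p.value
```

The sign test usually has less power than the signed-rank test, which is the price of ignoring the size of the differences.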

answered Jan 27 '16 at 17:55 by timle · edited Jan 27 '16 at 17:59 by gung - Reinstate Monica

This is a reasonable start. From the comments, I gather the OP would like to control for covariates. How might you adapt / extend this analysis in that situation? – gung - Reinstate Monica Jan 27 '16 at 18:02
In the case of controlling for covariates, paired tests can't do much.
However, if a regression model is being employed, it is trivial in R to convert these 'reading levels' into factors with the factor() function. Once converted, the regression model will treat the values as levels rather than as measures of magnitude.

The workflow would go something like this: run the regression model with 'reading levels' included as factors, then do follow-up tests between groups of interest using the wilcox.test procedure outlined above.
– timle Jan 28 '16 at 19:35
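
To make the factor() step concrete, here is a minimal sketch. The values in levels_raw are made up, and ordered = TRUE is an addition on my part: plain factor() would treat the levels as unordered categories, while an ordered factor keeps the ranking without implying equal spacing. Level "A" is left out, as in the pool used in the answer.

```r
# Converting numeric-looking reading levels into an ordered factor.
# levels_raw is hypothetical example data, not the OP's data.
pool <- c(1, 2, 3, 4, 6, 8, 10, 12, 14, 20, 24, 28, 30, 34, 38, 40, 50, 60, 70, 80)
levels_raw <- c(4, 20, 80, 6, 20)

# ordered = TRUE keeps the ranking of the levels; the gaps between the
# numeric labels (e.g. 40 vs 50) no longer carry any magnitude
reading_level <- factor(levels_raw, levels = pool, ordered = TRUE)

levels(reading_level)      # the 20 labels, in order
as.integer(reading_level)  # rank positions within pool, not the raw labels
reading_level[1] < reading_level[2]  # comparisons work on ordered factors
```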
Well, there is ordinal logistic regression (of which many classical nonparametric tests are special cases). There are mixed effects versions of OLR that are appropriate for repeated measures data (eg, see my answer here: Is there a two-way Friedman's test?). – gung - Reinstate Monica Jan 28 '16 at 19:45
Those are excellent suggestions, though I would start with 'reading level' as a factor in a conventional regression framework first. If it appears that this regression model violates the assumption of homoscedasticity, I would absolutely move to non-parametric regression options as you suggest. Based on the information given, though, it is only possible to ascertain that a paired test on this categorical data should be of the non-parametric variety. There is no evidence (given in the post) to suggest that the same would be true of the regression model. – timle Jan 28 '16 at 19:54
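
For readers who want to see what the ordinal logistic regression mentioned in the comments might look like, here is a minimal sketch using polr() from the MASS package. Everything about the data is an assumption: the outcome has four made-up levels and ell is a hypothetical English Language Learner indicator. This simple model does not account for the time 1 / time 2 pairing; a mixed-effects version (for example clmm() in the ordinal package, not shown here) would be needed for that.

```r
# Proportional-odds (ordinal logistic) regression on simulated data.
# All variables are hypothetical; this is not the OP's data or model.
library(MASS)  # provides polr()

set.seed(42)
n <- 200
ell <- rbinom(n, 1, 0.3)          # hypothetical covariate (0/1)
latent <- -0.8 * ell + rlogis(n)  # latent score with logistic noise

# Cut the latent score into four ordered categories
reading <- cut(latent,
               breaks = c(-Inf, -1, 0, 1, Inf),
               labels = c("low", "mid", "high", "top"),
               ordered_result = TRUE)
dat <- data.frame(reading = reading, ell = ell)

# Hess = TRUE stores the Hessian so summary() can report standard errors
fit <- polr(reading ~ ell, data = dat, Hess = TRUE)
summary(fit)
```

The model estimates one slope for ell plus three cut-points between the four categories, so the outcome is only ever used through its ordering.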
