
I have Likert data from a survey that, unfortunately, I did not get to design, and I would like to see whether there is any way to assess whether an intervention is having an effect (or rather, to assess participants' perceptions of whether it is). I don't have baseline data, only post-intervention data; I recognize the major limitations of the data, but I'm hoping something is salvageable!

Our "intervention:" we made medical specialists available via web chat to generalist clinicians that did not previously have access to specialists, so that they could ask for advice regarding complicated patients.

Evaluation data: we have no "before" data, and this was obviously not blinded or randomized. We just have survey data from clinicians who used this resource. The data is currently in Google Docs, hah, but could be exported elsewhere (ideally Stata or Excel). The questions we asked are as follows:

I changed elements of my patient's care based on what I learned from the specialist.

1 = Strongly disagree (my patient's care did not change)
5 = Strongly agree (my patient's care changed)

What I learned from the specialist allowed me to implement care which improved my patient's condition.

1 = Strongly disagree (there was no improvement)
5 = Strongly agree (there was improvement)

My knowledge about a particular medical condition improved as a result of the WhatsApp case discussion.

1 = Strongly disagree
5 = Strongly agree

The predicament: how should we present this data?!? We can calculate very simple summary stats (e.g. the mean), but for a medical education paper we'd like to write, it would be nice to say something more statistically meaningful about whether our intervention was better than doing nothing. I thought about simply comparing our mean to a null hypothesis of m = 3 (or m = 1; we debated what the null should really be), but I've read that Likert data doesn't work that way. Here are the two options suggested to me so far:

Two-proportion test: "bin" the data (either 1-2-3 vs. 4-5, or 1-2 vs. 3-4-5) and compare our proportion of "successful" responses to whatever proportion we think would have occurred if our intervention had no impact or a negative impact. (If we did this, any recommendations on how to choose the null hypothesis proportion?)
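For concreteness, here is a minimal sketch of that binning idea in Python with scipy. The responses, the 4-5 = "success" cutoff, and the null proportion of 0.5 are all illustrative assumptions, not values from the actual survey.

```python
from scipy.stats import binomtest

responses = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]   # hypothetical Likert responses (1-5)

successes = sum(1 for r in responses if r >= 4)   # bin 4-5 as "agree"/"success"
n = len(responses)
null_p = 0.5   # assumed null: agreement no more likely than not

res = binomtest(successes, n, null_p, alternative="greater")
print(f"{successes}/{n} in the 4-5 bin; one-sided exact p = {res.pvalue:.3f}")
```

With the 1-2 vs. 3-4-5 binning you would just move the cutoff and, most likely, rethink the null proportion as well.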

Chi-square test: if our null hypothesis is that the responses are random (our intervention isn't causing any positive effect, so people are just selecting their survey answers at random), then we could use a chi-square goodness-of-fit test to compare the distribution of our answers to a hypothetical normal distribution centered around 3. I'm intrigued by this idea, but I'm not sure how to actually implement it mathematically.
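Mechanically, this is a goodness-of-fit test: tally the observed counts for responses 1 through 5 and compare them to the counts expected under whatever null distribution you choose to specify. A minimal sketch in Python with scipy, using hypothetical counts, a uniform null, and an illustrative "peaked at 3" alternative in a comment:

```python
import numpy as np
from scipy.stats import chisquare

observed = np.array([2, 3, 8, 15, 12])   # hypothetical counts of responses 1..5
n = observed.sum()

# Null of "answers chosen completely at random" -> uniform expected counts.
expected_uniform = np.full(5, n / 5)
# An alternative null, symmetric and peaked at the neutral response 3
# (illustrative probabilities, not derived from any data):
# expected_peaked = n * np.array([0.1, 0.2, 0.4, 0.2, 0.1])

stat, p = chisquare(f_obs=observed, f_exp=expected_uniform)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```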

Do those ideas sound reasonable? Are there other ideas to analyze our Likert data to show that what we're doing is better than nothing... without actually having a control group / counterfactual / baseline data?

In case it wasn't clear, I have decent classroom training in statistics thanks to my medical education, but I have not worked with it on real projects before. I will happily delve into any resources you all can provide, though! Thank you so much.

mt.MD
  • Do you just have pre and post intervention data or do you have it collected over time? – costebk08 Dec 10 '19 at 04:09
  • Are you interested in analyzing each question individually or combining the responses for several questions into a scale? – Sal Mangiafico Dec 10 '19 at 10:58
  • @costebk08 - I only have post-intervention data, unfortunately. – mt.MD Dec 13 '19 at 05:17
  • @SalMangiafico I'm probably going to analyze each question individually, because they measure pretty different things (impact on patient care vs. clinician learning). – mt.MD Dec 13 '19 at 05:18
  • The general advice for Likert item data is to treat it as ordinal in nature. That is, the responses are really categories, not numbers, but you know how to order them (Strongly agree is greater than Agree, and so on). So you don't want to use tests that assume the variable is numeric (like the t test and, actually, the Wilcoxon signed rank test)... It sounds like you want a one-sample test (because you don't have two groups to compare). One approach is the one-sample sign test, with which you might use 3 as the neutral response. That is, – Sal Mangiafico Dec 13 '19 at 13:51
  • you are asking if you had significantly more 4 and 5 responses than 1 and 2 responses. Another approach like this is the two-proportion test you suggest. There's nothing wrong with the chi-square test you propose, but I don't think the results will be very useful because you need to invent the distribution you expected. Note also for this test that "at random" might imply a uniform distribution of answers, whereas you also propose a null of normally distributed about the neutral response.... For summary statistics, proportions or bar plots for each response in each question are very useful, – Sal Mangiafico Dec 13 '19 at 13:56
  • if it's not overkill for the audience. But note, it is also sensible to report medians for ordinal data. The one-sample sign test I mentioned can also be interpreted as being about the median of the data. – Sal Mangiafico Dec 13 '19 at 13:58
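Following the one-sample sign test suggested in the comments above, here is a minimal sketch in Python with scipy, again using made-up responses and taking 3 as the neutral value; ties at 3 are dropped and the remaining responses are compared to a 50/50 split with an exact binomial test.

```python
import statistics
from scipy.stats import binomtest

responses = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]   # hypothetical Likert responses (1-5)
neutral = 3

# Sign test: drop ties at the neutral value, then ask whether "above neutral"
# responses outnumber "below neutral" responses more than chance would allow.
above = sum(1 for r in responses if r > neutral)
below = sum(1 for r in responses if r < neutral)

res = binomtest(above, above + below, 0.5, alternative="greater")
print(f"{above} above vs {below} below neutral; one-sided p = {res.pvalue:.3f}")

# Also sensible for ordinal data, per the comments: report the median.
print("median response:", statistics.median(responses))
```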