
I have an ever-increasing dataset and am starting to look at analysing some of it.

Here is a very simple example:

Hypothesis: People with higher levels of expertise will be able to correctly identify the high-quality (HQ) stimulus.

Summary Table:

Expertise  | CORRECT (freq) | WRONG (freq)
-----------|----------------|-------------
Very Low   |    3           |    0
Low        |    6           |    3
Middling   |    1           |    2
High       |   18           |    4
Very High  |   10           |   11
-----------|----------------|-------------
Total      |   38           |   20

So (for example) 3 people with "very low" expertise correctly identified the HQ stimulus, while 11 people with "very high" expertise wrongly chose the LQ stimulus.

Test: I'm not sure, but is a Chi-squared test the right approach for testing the hypothesis? If so, can someone please set out a worked example based on this sample? That way I can apply it across a few other very similar data samples that look at variables other than expertise (I have a few).


Additional details

I am a little confused by the comments, so I will try to explain as best I can. Please do ask for more detail.

The experiment: 58 people were assessed for their expertise in a subject and categorised into one of five bins (very low, low, middling, high and very high). They were then asked to evaluate two stimuli and choose the one they believed to be correct: one was factually correct (high quality, HQ), the other contained a number of errors (low quality, LQ).

At the end we had the results in the table above. Everyone (N = 38) in the CORRECT (HQ) column chose correctly; the rows give the breakdown of those who were correct by their expertise. The remaining 20 people got the evaluation wrong and are counted in the WRONG (LQ) column.

The hypothesis (above) stands: people with greater expertise will be able to correctly identify the high-quality stimulus.

Is that clearer?

What I want: Can a test be used with this data to accept/reject this hypothesis? If so, which one? If not, why not, and what would be needed data-wise?

BarneyC
  • I'm a bit confused by your hypothesis. How do we know if anyone has correctly identified anything based on your data set? – dsaxton Mar 04 '16 at 14:54
  • Sorry, I thought I had covered that just below the table, starting "So..." – BarneyC Mar 04 '16 at 14:55
  • By which I mean: the CORRECT (freq) column contains the correct observations, e.g. only 1 person with "middling" expertise correctly identified the HQ stimulus. Does that make more sense? – BarneyC Mar 04 '16 at 14:57
  • The text restates some of what's in the table, but the table itself doesn't tell us who is "right" or "wrong," which is what your hypothesis is about. – dsaxton Mar 04 '16 at 14:59
  • No one will be able to help very much until you explain your experimental design in more detail, and spell out exactly what value you are trying to quantify or which alternative hypotheses you want to compare. – Harvey Motulsky Mar 04 '16 at 14:59
  • Another possible way of posing the question might be: "Does expertise in a subject influence the ability to identify a factually correct stimulus?" – BarneyC Mar 04 '16 at 19:03
  • So every person contributed a single observation? Do you have the underlying expertise data that could be used instead of the categorization? Note that a chi-squared test would not incorporate information about the ordering of the expertise categories. If you don't have the actual expertise categories, can you make a reasonable guess at what the numbers would be for a latent expertise scale? – gung - Reinstate Monica Mar 04 '16 at 19:30
  • @gung to assess expertise each respondent answered three questions on a Likert scale (1-5) where 5 was Very High. These three scores were aggregated (e.g. someone answering 3, 4, 5 scored 12) and the total assigned to a category. From memory 12 was high expertise. – BarneyC Mar 04 '16 at 19:57

2 Answers


If you want to demonstrate directionality, as in "people with greater expertise will be [better] able to correctly identify the high quality stimulus", then you need a test that incorporates the direction. A chi-squared or similar test finds differences, but doesn't demonstrate direction.
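
For comparison, here is what a plain chi-squared test of independence looks like on your summary table (counts taken straight from the question). It only asks whether the correct/wrong split differs somewhere across the five expertise categories; it says nothing about whether accuracy rises with expertise.

# Chi-squared test of independence on the 5 x 2 summary table.
# Rows = expertise categories (Very Low ... Very High); columns = CORRECT, WRONG.
tab <- cbind(CORRECT = c(3, 6, 1, 18, 10),
             WRONG   = c(0, 3, 2, 4, 11))
chisq.test(tab)  # several expected counts are below 5, so R will likely warn
                 # that the chi-squared approximation may be inaccurate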

One way to proceed would be to use the raw results of your Likert scale (evidently from 3 to 15) as a measure of "greater expertise" and perform logistic regression of correct/wrong against that scale. That examines the log odds of making a correct choice as a linear function of the Likert scale value. If each extra point on the Likert scale is an equivalent increase in what you mean by "expertise" then the result of that regression would be a good test of your hypothesis.
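
As a minimal sketch of that regression (I don't have your respondent-level data, so the data frame below is entirely made up; expertise_score stands for the raw Likert sum from 3 to 15 and correct is 1 for choosing the HQ stimulus, 0 otherwise):

# Hypothetical per-respondent data, purely to illustrate the call.
df <- data.frame(
  expertise_score = c(3, 5, 6, 8, 9, 10, 11, 12, 13, 15),
  correct         = c(1, 0, 1, 1, 0, 1, 1, 0, 1, 0)
)

fit <- glm(correct ~ expertise_score, data = df, family = binomial)
summary(fit)    # Wald z-test on the slope: does the log odds of a correct
                # choice change with expertise?
exp(coef(fit))  # exponentiated coefficients; the slope term is the odds ratio
                # per extra point on the expertise scale

A positive, significant slope would support your hypothesis; the sign is exactly what a chi-squared test cannot give you.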

That said, your hypothesis is not looking very good at this point. Over half the "Very High" group made errors according to your table, versus only 1/3 of those in the lower 3 classes and only 18% in the "High" group. The possibility of such a pattern emphasizes that, as for any linear model, you would have to test the linearity assumption of your logistic model; data thus far might suggest a peak in the relation between success on the test and your measure of expertise.
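
One simple way to probe that (reusing the hypothetical df and fit from the sketch above) is to add a quadratic term in expertise and compare the two models with a likelihood-ratio test:

# Quadratic term allows a peaked (non-monotonic) relationship.
fit2 <- glm(correct ~ expertise_score + I(expertise_score^2),
            data = df, family = binomial)
anova(fit, fit2, test = "LRT")  # a small p-value favours the curved model,
                                # i.e. evidence against simple linearity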

Other ways to proceed are provided in the answer on this page by Andrey Bortsov, and on this Cross Validated page.

EdM
  • Thanks for that tip. Were the hypothesis rejected I wouldn't be at all upset - it's based on literature and I'm looking for evidence that it does not hold true. Will go off and have a good look at logistic regression. – BarneyC Mar 07 '16 at 10:17

According to your description, you can use a Cochran-Armitage test for trend, which is a modification of the Chi-squared test. The null hypothesis is that the conditional probability of being wrong is the same in all categories of expertise (just like the Chi-squared test), but this test has more power if there is a trend in the frequencies of one variable across the ordered categories of the other variable. However, it has less power if the pattern is nonlinear. Here is how you would test it in R:

library(DescTools)  # provides CochranArmitageTest()
# Rows are the five expertise categories (Very Low ... Very High);
# first column = CORRECT counts, second column = WRONG counts
x <- cbind(c(3, 6, 1, 18, 10), c(0, 3, 2, 4, 11))
CochranArmitageTest(x)

Output:

    Cochran-Armitage test for trend

data:  x
Z = -1.3879, dim = 5, p-value = 0.1652
alternative hypothesis: two.sided
  • Can you explain a little about how the test works? – Glen_b Mar 05 '16 at 09:46
  • Thanks for contributing to the site and providing more information on the test. The site tries to be a repository of useful answers that go beyond the original questioner's needs, so a bit of extra explanation beyond the specific question can be a good idea. Links to further information on sites that are likely to be maintained (e.g., this site, Wikipedia) can also be useful. – EdM Mar 05 '16 at 15:49