Within the scientific literature, there is a tentative proposal to change the threshold for statistical significance from p < 0.05 to p < 0.005:
http://www.nature.com/articles/s41562-017-0189-z
I understand there is a lot of nuance to this proposal and don't necessarily want to get too much into the pros and cons!
To test how this proposal performs in the real world, I have collated primary endpoint p values for a large number of scientific studies, and have assigned each study a Likert-type ordinal score describing its value, where 1 = low importance study and 5 = highly important study (based on a complex calculation taking journal impact factor, number of citations, H-score and a few other factors into account).
So I have two columns of data as follows:
Column A: 1-5 (Ordinal) - Where 1 = Low Importance Study; 5 = High Importance Study
Column B: 0-1 (Categorical, Dichotomous) - Does the study primary endpoint meet p < 0.005? YES or NO, represented as 1 or 0
I can see visually from the data that the primary endpoint of most low value studies does not meet the significance threshold of p < 0.005 (only 20% of studies scoring 1 meet it), but most high value studies do meet it (89% of studies scoring 5). For the studies I have analysed so far, the breakdown by Likert score is as follows:
1: 20% meet p <0.005
2: 63% meet p <0.005
3: 85% meet p <0.005
4: 93% meet p <0.005
5: 89% meet p <0.005
If I group studies scoring 1-2 as "Not valuable", and studies scoring 3-4-5 as "Moderately/Very Valuable", I get:
1-2: 45% meet p <0.005
3-4-5: 89% meet p <0.005
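For clarity, this is roughly how I am getting those percentages from the two columns. It is only a minimal sketch: the function name `proportion_meeting` is my own, and the toy rows are made up for illustration and are not my real data.

```python
from collections import defaultdict

def proportion_meeting(scores, meets, groups=None):
    """Return {label: fraction of studies with meets == 1}.

    scores : Column A values (1-5)
    meets  : Column B values (1 if p < 0.005, else 0)
    groups : optional mapping from score to a group label, e.g. 1 -> "1-2"
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for score, flag in zip(scores, meets):
        label = groups[score] if groups else score
        totals[label] += 1
        hits[label] += flag
    return {label: hits[label] / totals[label] for label in sorted(totals)}

# Made-up toy rows for illustration only (NOT my real data)
scores = [1, 1, 2, 2, 3, 3, 4, 5]
meets = [0, 0, 1, 0, 1, 1, 1, 1]

# Proportion meeting p < 0.005 for each individual score
per_score = proportion_meeting(scores, meets)

# Same calculation after collapsing scores into the two groups
grouping = {1: "1-2", 2: "1-2", 3: "3-4-5", 4: "3-4-5", 5: "3-4-5"}
grouped = proportion_meeting(scores, meets, grouping)

print(per_score)
print(grouped)
```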
I am wondering how I can describe this better in statistical terms, and what test would be appropriate to describe the association between study value and the binary metric of meeting p < 0.005. In layman's terms, I would like to describe how efficiently this new threshold identifies and excludes low quality papers, as well as how well it identifies and preserves high quality papers.
Is Spearman's rho appropriate here? Or would I be better off describing this with receiver operating characteristic (ROC) curves and the language of sensitivity, specificity, etc.?
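To make the sensitivity/specificity option concrete, this is the kind of calculation I have in mind. It treats "valuable" (score of 3 or more) as the condition and "meets p < 0.005" as the test result. It is a rough sketch only: the cutoff of 3, the function name and the toy rows are my own assumptions, and the rows are not my real data.

```python
def sensitivity_specificity(scores, meets, cutoff=3):
    """Treat score >= cutoff as 'valuable' and meets == 1 as 'test positive'."""
    tp = fn = tn = fp = 0
    for score, flag in zip(scores, meets):
        if score >= cutoff:
            # Valuable study: did the threshold preserve it?
            if flag:
                tp += 1
            else:
                fn += 1
        else:
            # Not valuable study: did the threshold exclude it?
            if flag:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn)  # valuable studies that meet p < 0.005
    specificity = tn / (tn + fp)  # non-valuable studies that fail p < 0.005
    return sensitivity, specificity

# Made-up toy rows for illustration only (NOT my real data)
scores = [1, 1, 2, 2, 3, 3, 4, 5]
meets = [0, 0, 1, 0, 1, 0, 1, 1]

sens, spec = sensitivity_specificity(scores, meets)
print(sens, spec)
```

If I read my grouped figures the same way, sensitivity would be about 89% (valuable studies meeting the threshold) and specificity about 55% (100% minus the 45% of non-valuable studies that still meet it), but I am not sure this is the right framing.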
For anyone interested, my data is here: https://ufile.io/k3abnh1s