
I have a dataset with two variables: diagnosis (yes/no) and score (a numeric variable from 0 to 10). I need to run an ROC analysis on these data and find the best cut-off value.

The problem is that the dataset is too small to split into training and test sets.

I understand that I need to draw a bootstrapped data set, build the model on it, see how it performs on the original data, and repeat this process X times. From what I have read, I can achieve that with rms::validate in R.

However, this gives me only the general AUC without the ability to run and plot models with different cut-offs of the score. Is there another way to achieve my goal? Thank you
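For illustration, the bootstrap procedure described above can be applied to the cut-off itself rather than only to the AUC. The following is a minimal sketch in Python (not the rms::validate implementation), assuming the decision rule "predict yes when score >= cutoff" and Youden's J as the selection criterion. The data are simulated stand-ins with 27 positives and 140 negatives, and the names `youden_j`, `best_cutoff`, and `optimism_corrected_j` are hypothetical helpers introduced here.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Simulated stand-in data (NOT the real data): 27 positives, 140 negatives,
# integer scores clipped to the range 0..10.
labels = [1] * 27 + [0] * 140
scores = [min(10, max(0, round(rng.gauss(6.0 if y == 1 else 4.0, 2.0))))
          for y in labels]
cutoffs = list(range(1, 11))  # assumed rule: predict "yes" when score >= cutoff


def youden_j(scores, labels, cutoff):
    """Sensitivity + specificity - 1 for the rule score >= cutoff."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        if y == 1 and s >= cutoff:
            tp += 1
        elif y == 1:
            fn += 1
        elif s >= cutoff:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn) + tn / (tn + fp) - 1


def best_cutoff(scores, labels, cutoffs):
    """Cutoff with the largest Youden's J (first one wins on ties)."""
    return max(cutoffs, key=lambda c: youden_j(scores, labels, c))


def optimism_corrected_j(scores, labels, cutoffs, n_boot=200):
    """Bootstrap optimism correction for J at the selected cutoff."""
    n = len(scores)
    apparent_cutoff = best_cutoff(scores, labels, cutoffs)
    apparent_j = youden_j(scores, labels, apparent_cutoff)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        b_scores = [scores[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        if len(set(b_labels)) < 2:
            continue  # skip resamples that lost one of the two classes
        # Select the cutoff on the bootstrap sample, then test it on the original data.
        c_boot = best_cutoff(b_scores, b_labels, cutoffs)
        optimism.append(youden_j(b_scores, b_labels, c_boot)
                        - youden_j(scores, labels, c_boot))
    corrected_j = apparent_j - sum(optimism) / n_boot
    return apparent_cutoff, apparent_j, corrected_j


apparent_cutoff, apparent_j, corrected_j = optimism_corrected_j(scores, labels, cutoffs)
print("apparent cutoff:", apparent_cutoff)
print("apparent J:", round(apparent_j, 3), "corrected J:", round(corrected_j, 3))
```

Calling `youden_j` for every value in `cutoffs` on the original data gives the per-cutoff sensitivity and specificity trade-off that a single AUC figure hides. The correction step only matters for the cutoff that was selected, because that selection is where the optimism enters.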

  • Welcome to Cross Validated! The Wynants article covered in this Cross Validated answer discusses small sample sizes leading to unstable optimal thresholds. Sure, you might run the bootstrap well and get the optimal threshold to be $0.3$, but if the optimal thresholds in your bootstrap iterations have an interquartile range of $0.1$ to $0.5$, how much would you trust that $0.3$ threshold? – Dave Feb 26 '23 at 15:31
  • Well, it's not that small. I have 167 observations, but only 27 patients have a positive diagnosis. Anyway, I have to run the analysis, so I'm just looking for the most 'correct' way to do it. – Inbar Lavie Feb 26 '23 at 16:15
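
The stability concern raised in the comments can be checked directly by recording which cut-off each bootstrap resample selects and looking at the spread. This is a rough sketch in Python under the same assumptions as before: simulated stand-in data (27 positives, 140 negatives, not the real data), the rule "predict yes when score >= cutoff", and Youden's J as the criterion. `youden_j` and `bootstrap_cutoffs` are hypothetical helper names.

```python
import random
import statistics
from collections import Counter

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Simulated stand-in data (NOT the real data): 27 positives, 140 negatives.
labels = [1] * 27 + [0] * 140
scores = [min(10, max(0, round(rng.gauss(6.0 if y == 1 else 4.0, 2.0))))
          for y in labels]
cutoffs = list(range(1, 11))  # assumed rule: predict "yes" when score >= cutoff


def youden_j(scores, labels, cutoff):
    """Sensitivity + specificity - 1 for the rule score >= cutoff."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    sensitivity = sum(s >= cutoff for s in pos) / len(pos)
    specificity = sum(s < cutoff for s in neg) / len(neg)
    return sensitivity + specificity - 1


def bootstrap_cutoffs(scores, labels, cutoffs, n_boot=500):
    """Return the J-optimal cutoff chosen in each bootstrap resample."""
    n = len(scores)
    chosen = []
    while len(chosen) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        b_scores = [scores[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        if len(set(b_labels)) < 2:
            continue  # skip resamples that lost one of the two classes
        chosen.append(max(cutoffs, key=lambda c: youden_j(b_scores, b_labels, c)))
    return chosen


chosen = bootstrap_cutoffs(scores, labels, cutoffs)
q1, median, q3 = statistics.quantiles(chosen, n=4)  # quartiles of the chosen cutoffs
print("quartiles of selected cutoffs:", q1, median, q3)
print("selection frequency:", sorted(Counter(chosen).items()))
```

A wide interquartile range, or a selection frequency spread over several neighbouring cut-offs, would suggest that the single "best" cut-off from the full sample should be reported with caution.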

0 Answers