
I developed an ML algorithm (XGBoost) to predict a target in my data set. Here are the results of my predictions on my test set:

library(tibble)

pred_dt_poumon <- tibble(
  truth = factor(c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0)),
  response = c(0, 0, 0, 1, 0, 1, 0, 1, 1, 0),
  prob_0 = c(0.8829266, 0.8831959, 0.7404993, 0.3993190, 0.6625459, 0.3192227, 0.6028344, 0.2362246, 0.4525665, 0.9415646),
  prob_1 = c(0.11707342, 0.11680406, 0.25950068, 0.60068098, 0.33745408, 0.68077731, 0.39716560, 0.76377536, 0.54743350, 0.05843538)
)

I understand that the accuracy here corresponds to this code:

num_total <- nrow(pred_dt_poumon)                                    # number of test cases
num_success <- sum(pred_dt_poumon$response == pred_dt_poumon$truth)  # correct predictions
accuracy_poumon <- num_success / num_total                           # 7/10 = 0.7 for the sample above

But I'm stuck now, because I don't know what to compare in order to get a p-value for this accuracy. Should I compare the response column against the truth column? Or num_success against 0.5 (chance)? Or the predicted probabilities against the ground truth? This is not really a code question but much more a methodology one.

Nicolas
  • I'm not sure what you mean by the "p-value of this accuracy". A p-value is used to provide evidence in support of (or against) a testable hypothesis. Have you thought through your hypotheses and formulated your null ($H_0$) and alternative ($H_1$) hypotheses?

    Or perhaps you are trying to obtain the probability of each observation being of a given class?

    Better yet, why don't you simply tell us what you are trying to accomplish? What is it that you are interested in investigating or understanding with statistics with regard to your ML algorithm?

    – StatsStudent Jun 12 '23 at 08:11
  • @StatsStudent In fact, if I obtain an accuracy with my algorithm on my test set, and if I do some subgroup analyses (as in my case), I suppose I must prove that this result is not due to chance (?); that is the purpose of my question. I think my $H_0$ is: my test did not do better than random predictions. So I did a binomial test to check whether my accuracy is better than 0.5 (a sketch follows these comments). – Nicolas Jun 12 '23 at 08:14
  • Then, take a look at the following answer (and others on Cross Validated) which should address your question: https://stats.stackexchange.com/questions/368176/how-to-determine-whether-a-classifier-is-significantly-better-than-random-guessi – StatsStudent Jun 12 '23 at 08:18
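For reference, a minimal sketch of the binomial test mentioned in the comments, assuming the pred_dt_poumon tibble shown in the question (binom.test() is base R):

# Exact binomial test of the observed accuracy against chance.
# H0: accuracy = 0.5; H1: accuracy > 0.5 (one-sided).
num_total <- nrow(pred_dt_poumon)
num_success <- sum(pred_dt_poumon$response == pred_dt_poumon$truth)
binom.test(num_success, num_total, p = 0.5, alternative = "greater")
# For the 10-row sample above (7/10 correct), this gives p ≈ 0.172.

# Caveat: 9 of the 10 truth values are 0, so a classifier that always
# predicts 0 already reaches 0.9 accuracy. With imbalanced classes the
# no-information rate (0.9 here), rather than 0.5, is the fairer baseline.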

0 Answers