Can you check the significance of your AUC values?

Question

I have computed AUC values from ROC curves based on logistic regressions. First, I split each of my two datasets (D1, D2) into three different drivers, call them L, La, and CC, and also kept all three together in one set, LLaCC.

The data is split into 80:20 train:test, respectively (N > 1,000,000 data points).
The logistic regression is performed on the train dataset.
The model is evaluated via the area under the curve method on the test dataset.
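To make the evaluation step concrete, here is a minimal sketch of an 80:20 split and a rank-based AUC in plain Python. It is an illustration under assumptions, not the code actually used: the helper names `train_test_split` and `auc` are hypothetical, the model-fitting step is omitted, and `scores` stands for whatever predicted probabilities the fitted logistic regression returns on the test rows.

```python
import random

def train_test_split(n, test_fraction=0.2, seed=0):
    """Shuffle row indices 0..n-1 and cut them into train and test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(round(n * (1 - test_fraction)))
    return idx[:cut], idx[cut:]

def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Tied scores share their average rank, so a tied positive/negative
    pair contributes 0.5 to the result.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy check on four hand-made points (not the data from this question).
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Each cell of the table below would then come from one such `auc(...)` call on the test rows of one dataset/driver combination.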

This gives AUC values for the two datasets (D1, D2) and the four drivers (L, La, CC, and LLaCC).

      L     La     CC    LLaCC
D1   .50    .60    .89     .93
D2   .50    .75    .81     .86

I have been asked whether these differences are significant, I assume both within and between groups. However, I do not know whether this is even possible. Are these not too few estimates to compare statistically? NB: No, this is not a school assignment.

Dichotomous statistical "significance" is rarely meaningful and is especially not meaningful here. For example you can have "significant" differences in AUROC that are trivial with large enough N. And AUROC can be too insensitive for performance comparisons. See here. Data splitting is a bad idea unless N > 20,000. Resampling is better. Data splitting is too subject to the "luck of the split". — Frank Harrell, Jan 31 '22 at 12:57
For which curve are you measuring the area? ROC curve? PR curve? Something else? — Sycorax, Jan 31 '22 at 13:20
But here there are no more than 8 AUROC values in total. Further, my datasets have more than 1 million data points, far beyond N > 20,000, so by your statement data splitting should not be a problem. I know there are other performance measures, e.g. the Brier score, but I am not using those. — Thomas, Jan 31 '22 at 14:39
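
One way to make the resampling idea from the comments concrete is a paired bootstrap on the test set: each AUROC is an estimate computed from many test rows, so its uncertainty comes from those rows rather than from the 8 summary values. The sketch below is an illustration under assumptions, not a method endorsed by any commenter. The function names are hypothetical, the data are synthetic, and the interval only reflects test-set sampling variability, not the "luck of the split" or the variability of refitting the model.

```python
import random

def auc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=500, seed=0):
    """Approximate 95% percentile interval for AUROC(a) - AUROC(b).

    Both score lists must refer to the same test rows, so every resample
    draws rows once and evaluates both models on them (paired resampling).
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # skip resamples that contain only one class
            a = [scores_a[i] for i in idx]
            b = [scores_b[i] for i in idx]
            diffs.append(auc(y, a) - auc(y, b))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]

# Synthetic example: model "a" is informative, model "b" is pure noise.
rng = random.Random(1)
labels = [i % 2 for i in range(100)]
scores_a = [y + rng.gauss(0, 0.5) for y in labels]
scores_b = [rng.random() for _ in labels]
lo, hi = bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=200)
print(f"approx. 95% interval for the AUROC difference: [{lo:.3f}, {hi:.3f}]")
```

The pairwise `auc` here is quadratic in the number of rows, so for a test set of the size described in the question a rank-based AUROC would be needed. With very large N the interval will be narrow, which is the point made above: a difference can be "significant" and still trivially small.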
