
For a personal project, I'm trying to figure out how to plot a ROC curve and compute the AUC for my problem. I have a list of thunder flashes, and I'm trying to find out whether a meteorological variable (CAPE) is a good predictor (or proxy) for these flashes. My idea is to evaluate this proxy and find the threshold (for example 1000 J/kg) that gives me the best score.

After a few days of learning how ROC/AUC works, I have not managed to build the right confusion matrix or, more generally, the ROC graph. For each hour of each day, I take the CAPE value and check in my list whether there was thunder at the same time. My confusion matrix (for a CAPE threshold of 1000) is:

Thunder event?    | Positive           | Negative
------------------------------------------------------------
Yes = flashes     | CAPE > threshold   | CAPE < threshold
                  | + flashes          | + flashes
------------------------------------------------------------
No = no flashes   | CAPE > threshold   | CAPE < threshold
                  | + no flashes       | + no flashes
------------------------------------------------------------

So in my mind: if the CAPE value is > threshold and there is a thunder flash, then TP; if the CAPE value is > threshold but there are no flashes, then FN; if CAPE < threshold but there are thunder flashes, then FP; and if CAPE < threshold and there are no flashes, then TN.
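For reference, here is a minimal sketch of how the four counts are usually tallied for a single threshold. Note that under the standard convention, "CAPE > threshold but no flashes" is counted as a false positive (FP) and "CAPE < threshold but flashes" as a false negative (FN). The function name `confusion_counts` and the hourly values are made up for illustration and are not real data.

```python
def confusion_counts(cape, flash, threshold):
    """Count TP, FP, FN, TN for the rule 'predict flashes when CAPE > threshold'."""
    tp = fp = fn = tn = 0
    for value, observed in zip(cape, flash):
        predicted = value > threshold
        if predicted and observed:
            tp += 1          # flashes predicted and observed
        elif predicted and not observed:
            fp += 1          # flashes predicted, none observed
        elif not predicted and observed:
            fn += 1          # flashes observed but not predicted
        else:
            tn += 1          # nothing predicted, nothing observed
    return tp, fp, fn, tn


# Hypothetical hourly CAPE values (J/kg) and flash observations.
cape = [200, 1500, 800, 2500, 50, 1200, 300, 1800]
flash = [False, True, False, True, False, False, True, True]

tp, fp, fn, tn = confusion_counts(cape, flash, threshold=1000)
tpr = tp / (tp + fn)  # true positive rate (sensitivity)
fpr = fp / (fp + tn)  # false positive rate
print(tp, fp, fn, tn, tpr, fpr)
```

A single threshold like this gives only one (FPR, TPR) point, not a whole curve.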

But something is missing in my understanding of how to plot a ROC curve and compute the AUC. With this classification it cannot work, because I have a very large number of TN, so the point that should be at (1,1) in my ROC graph is in fact at (1,0).

With my problem, is it possible to evaluate my CAPE proxy and find the best threshold with a ROC graph?

Thank you very much for any help, I'm not familiar with this kind of statistics.

Best regards!

  • I think this question originates in a misunderstanding of the ROC curve. Constructing a ROC curve plots how TPR/FPR change as you change the value of the threshold, not a single value of the threshold. The duplicate thread explains this in more detail. – Sycorax May 07 '22 at 17:34
  • I have already read this thread, it didn't help me with my problem. – yonafunu May 07 '22 at 17:38
  • Your problem arises because what you’re doing is not how a ROC curve is constructed. You need to compute the TPR and FPR for each value of the threshold, and then plot TPR and FPR. Again, this is described in the duplicate. – Sycorax May 07 '22 at 18:03
  • Ok, then what is my best option to find the best threshold and to evaluate my proxy as a predictor of thunder flashes? Thanks – yonafunu May 07 '22 at 18:33
  • ROC analysis is an option to find a suitable threshold, but you've not applied it correctly. Asking for the best method is a different question than what you ask here, so it's best to just ask a new question using the ASK QUESTION button at the top of the page. Bear in mind that to determine the "best" method, you'll need to specify what the criteria are for comparing methods. – Sycorax May 07 '22 at 18:46
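The procedure described in the comments (compute TPR and FPR for each value of the threshold, then plot them) could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are made up, the helper names `roc_points` and `auc_trapezoid` are hypothetical, and Youden's J (TPR minus FPR) is used as just one possible criterion for a "best" threshold, not necessarily the right one for this problem.

```python
def roc_points(cape, flash):
    """Return (threshold, fpr, tpr) for the rule 'CAPE > threshold', one per candidate threshold."""
    positives = sum(1 for f in flash if f)
    negatives = len(flash) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one hour with flashes and one without")
    # Candidates: every observed CAPE value, from highest to lowest,
    # then -inf so that the last point predicts flashes everywhere: (1, 1).
    thresholds = sorted(set(cape), reverse=True) + [float("-inf")]
    points = []
    for t in thresholds:
        tp = sum(1 for c, f in zip(cape, flash) if c > t and f)
        fp = sum(1 for c, f in zip(cape, flash) if c > t and not f)
        points.append((t, fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical hourly CAPE values (J/kg) and flash observations.
cape = [200, 1500, 800, 2500, 50, 1200, 300, 1800]
flash = [False, True, False, True, False, False, True, True]

points = roc_points(cape, flash)
auc = auc_trapezoid(points)

# Assumed criterion: maximise Youden's J = TPR - FPR.
best_threshold, best_fpr, best_tpr = max(points, key=lambda p: p[2] - p[1])
print(auc, best_threshold, best_fpr, best_tpr)
```

Because TPR and FPR are each normalised by their own class size, a very large number of TN does not prevent the curve from running from (0,0) to (1,1); it only affects how FPR is scaled.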
