For a personal project, I'm trying to figure out how to draw a ROC curve and compute the AUC for my current problem. I have a list of lightning flashes, and I'm trying to find out whether a meteorological variable (CAPE) is a good predictor (or proxy) for these flashes. My idea is to evaluate this proxy and find the threshold (for example 1000 J/kg) that gives the best score.
After a few days of learning how ROC/AUC works, I still can't manage to build the right confusion matrix or, more generally, the ROC curve. For each hour of each day, I take the CAPE value and check in my list whether there was a flash at the same time. My confusion matrix (for a CAPE threshold of 1000) is:
Thunder event?  | Positive         | Negative
-------------------------------------------------------
Yes = flashes   | CAPE > threshold | CAPE < threshold
                | + flashes        | + flashes
-------------------------------------------------------
No = no flashes | CAPE > threshold | CAPE < threshold
                | no flashes       | no flashes
-------------------------------------------------------
So in my mind: if CAPE > threshold and there is a flash, it is a TP; if CAPE > threshold but there are no flashes, it is a FP; if CAPE < threshold but there are flashes, it is a FN; and if CAPE < threshold and there are no flashes, it is a TN.
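To make the bookkeeping concrete, here is a minimal sketch of counting the four cells for one threshold. The data are made up for illustration, and the names `confusion_counts`, `cape_values` and `flash_observed` are hypothetical, not taken from any library. It assumes one CAPE value and one yes/no flash flag per hour.

```python
def confusion_counts(cape_values, flash_observed, threshold):
    """Count TP, FP, FN, TN for the rule 'predict a flash when CAPE > threshold'."""
    tp = fp = fn = tn = 0
    for cape, flash in zip(cape_values, flash_observed):
        predicted = cape > threshold
        if predicted and flash:
            tp += 1      # CAPE above threshold, flash observed
        elif predicted and not flash:
            fp += 1      # CAPE above threshold, no flash
        elif not predicted and flash:
            fn += 1      # CAPE below threshold, flash observed
        else:
            tn += 1      # CAPE below threshold, no flash
    return tp, fp, fn, tn


# Made-up hourly samples (assumption: one CAPE value and one flag per hour).
cape_values = [50, 200, 1500, 2500, 900, 1200, 100, 3000, 700, 400]
flash_observed = [False, False, True, True, True, False, False, True, False, False]

tp, fp, fn, tn = confusion_counts(cape_values, flash_observed, threshold=1000)
tpr = tp / (tp + fn)   # true positive rate (sensitivity)
fpr = fp / (fp + tn)   # false positive rate
print(tp, fp, fn, tn, tpr, fpr)
```

Each threshold gives one (FPR, TPR) pair, which is one point on the ROC graph. Note that TN only appears in the denominator of FPR, so a very large number of TN makes FPR small but keeps it between 0 and 1.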
But something is missing in my understanding of how to draw a ROC curve and compute the AUC. With this classification it cannot work, because I have a very large number of TN, so the point that should be at (1,1) on my ROC graph actually ends up at (1,0).
Given my problem, is it possible to evaluate my CAPE proxy and find the best threshold with a ROC curve?
Thank you very much for any help; I'm not familiar with this kind of statistics.
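As a rough illustration of what such an evaluation could look like, the sketch below sweeps the threshold over every observed CAPE value, collects the (FPR, TPR) points, integrates them with the trapezoid rule to get the AUC, and picks the threshold maximising Youden's J (TPR minus FPR). The data are made up, the function names are hypothetical, and Youden's J is only one possible criterion for "best", chosen here as an assumption.

```python
def roc_points(cape_values, flash_observed):
    """Return a list of (threshold, fpr, tpr) for the rule 'CAPE > threshold'."""
    n_pos = sum(1 for f in flash_observed if f)
    n_neg = len(flash_observed) - n_pos
    # +inf predicts no flash at all -> (0, 0); -inf predicts a flash always -> (1, 1).
    thresholds = ([float("inf")]
                  + sorted(set(cape_values), reverse=True)
                  + [float("-inf")])
    points = []
    for t in thresholds:
        tp = sum(1 for c, f in zip(cape_values, flash_observed) if c > t and f)
        fp = sum(1 for c, f in zip(cape_values, flash_observed) if c > t and not f)
        points.append((t, fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve, integrating TPR over FPR with trapezoids."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up hourly samples (assumption: one CAPE value and one flag per hour).
cape_values = [50, 200, 1500, 2500, 900, 1200, 100, 3000, 700, 400]
flash_observed = [False, False, True, True, True, False, False, True, False, False]

points = roc_points(cape_values, flash_observed)
auc = auc_trapezoid(points)
# Youden's J = TPR - FPR; the threshold with the largest J is taken as "best".
best_threshold, best_fpr, best_tpr = max(points, key=lambda p: p[2] - p[1])
print(auc, best_threshold, best_fpr, best_tpr)
```

The curve runs from (0,0) at an infinitely high threshold to (1,1) at an infinitely low one. An AUC close to 1 would suggest CAPE separates flash hours from non-flash hours well, while 0.5 would mean it does no better than chance.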
Best regards!