If multiple [cost, gamma] values yield identical or extremely similar cross-validated performance (regardless of the specific metric - ACC, F1, MCC), what technique should be used to select which [cost, gamma] pair to train with?
I have seen cases where equal (or approximately equal) performance is obtained from very different, even contradictory [cost, gamma] values (for example, both very low and very high), found in different regions of the grid during a search.
Should I take the average of [cost, gamma] over the range where performance is identical? Or is it better to choose, for example, a low cost and a high gamma?
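To make the situation concrete, here is a minimal sketch (not my actual code; the data set, grid values, and tolerance are made up for illustration) of a grid search in which ties in cross-validated accuracy are broken by a simple heuristic: among the tied pairs, prefer the smallest cost, then the smallest gamma, on the grounds that a smaller cost means stronger regularization and hence a simpler model. I am asking whether a rule like this, an average over the tied region, or something else entirely is the principled choice.

```python
# Sketch only: grid search over (cost, gamma) with an ad hoc tie-breaking rule.
# The data, grid, and tolerance below are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# A balanced toy data set (the real question applies to any balanced data).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

costs = [0.1, 1, 10, 100]
gammas = [0.001, 0.01, 0.1, 1]

# Collect mean cross-validated accuracy for every (cost, gamma) pair.
results = []
for c in costs:
    for g in gammas:
        score = cross_val_score(SVC(C=c, gamma=g), X, y, cv=5).mean()
        results.append((score, c, g))

best_score = max(r[0] for r in results)

# Treat scores within a small tolerance of the best as "tied".
tol = 0.01
tied = [r for r in results if best_score - r[0] <= tol]

# Ad hoc rule: among ties, prefer the smallest cost, then the smallest gamma
# (i.e., the most strongly regularized model).
_, best_c, best_g = min(tied, key=lambda r: (r[1], r[2]))
print(best_c, best_g)
```

The same question applies if the tie-break key were reversed (largest cost first) or replaced by averaging over the tied region: which rule, if any, is justified?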
Please note, this question is not a duplicate (as suggested in the comments):
(1) This question is about a balanced data set, so accuracy is a reasonable measure of performance: any value above 50% is better than random. Still, the question is not specific to accuracy; that is just one possible metric, see (2).
(2) This question is not about whether accuracy is a scientifically valid metric. It applies to any model performance evaluation metric. If you have several [cost, gamma] points with equal values of measure x (be it ACC, AUC ROC, AUC PR, F1, MCC, or any other metric), how can you select which [cost, gamma] is more appropriate?
But accuracy is incredibly deceptive, even if your data are balanced, since it makes no distinction between marginal and confident decisions. For more information, please search our archives and review https://stats.stackexchange.com/questions/312780/why-is-accuracy-not-the-best-measure-for-assessing-classification-models?noredirect=1&lq=1
– Sycorax Mar 26 '18 at 05:09