How to use a t-test to determine the best values from a set of scores

Asked Oct 06 '22 at 04:18

Active Oct 06 '22 at 04:40

The scores are $R^2$ values for around 20 machine learning classifiers. Which of the scipy statistical tests (https://docs.scipy.org/doc/scipy/reference/stats.html#statistical-tests) would be appropriate here? I'd like to identify the best scores, for example "according to a t-test with p < .05". The classifier results might look like this:

Classifier	Score
RF	0.42
MLR	0.12
ANN	0.71
SVR	0.72
...	...

The best scores according to a t-test with p < .05 might then be something like "0.71" and "0.72".

edited Oct 06 '22 at 04:40

asked Oct 06 '22 at 04:18

ds02

Welcome to Cross Validated! What makes you see a t-test as appropriate for your task? I don’t see it but am open to learning. // Perhaps you can expand on what exactly you’re doing. For instance, depending on how you calculate $R^2$, you might not be shielding against overfitting the training data. (And if you’re a Python user, you’re probably even calculating $R^2$ with a function with which I disagree, though that is less of a concern of mine right now.) – Dave Oct 06 '22 at 05:16
The $R^2$ values alone are probably not enough; you will need the target and prediction for each data point in the test set, or at least the sample size. – rep_ho Oct 06 '22 at 08:42
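
To illustrate the point about needing per-data-point targets and predictions, here is a minimal sketch of one possible approach: a paired t-test (`scipy.stats.ttest_rel`) on the per-sample squared errors of two models evaluated on the same test set. The helper name `compare_models_paired` and the synthetic data are made up for illustration, and the thread does not establish that a t-test is the right tool for this task.

```python
import numpy as np
from scipy import stats


def compare_models_paired(y_true, pred_a, pred_b, alpha=0.05):
    """Paired t-test on per-sample squared errors of two models.

    Hypothetical helper: assumes both models were scored on the
    same test points, so the errors are paired sample by sample.
    """
    y_true = np.asarray(y_true, dtype=float)
    err_a = (y_true - np.asarray(pred_a, dtype=float)) ** 2
    err_b = (y_true - np.asarray(pred_b, dtype=float)) ** 2
    result = stats.ttest_rel(err_a, err_b)
    return float(result.statistic), float(result.pvalue), bool(result.pvalue < alpha)


# Synthetic data for illustration only (not the scores from the question).
rng = np.random.default_rng(0)
y = rng.normal(size=200)
pred_low_noise = y + rng.normal(scale=0.3, size=200)   # more accurate model
pred_high_noise = y + rng.normal(scale=1.0, size=200)  # less accurate model

t_stat, p_value, significant = compare_models_paired(
    y, pred_low_noise, pred_high_noise
)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}, significant = {significant}")
```

A negative t statistic here means the first model has the smaller mean squared error. With around 20 models, repeating this for every pair would also call for a multiple-comparison correction, which this sketch leaves out.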
