I know that during validation we are interested in how the model will perform in real-world scenarios, so the class ratios in the validation/test sets should match the original ones.
Say, however, that we are performing some kind of parameter search/optimization. If we are comparing different candidate configurations, I guess we should not use the (unweighted) validation loss to compare models: with class imbalance, the minority class has little "representation" in that loss, so we could end up choosing a model that performs well on the majority class but poorly on the minority one. We should instead use a metric that treats both classes equally, as in the sketch below. Is that right?
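To make what I mean concrete, here is a minimal sketch of how I would compare configurations with a class-balanced metric (macro-F1 from scikit-learn) instead of the plain validation loss. The `model`, `val_loader`, and `criterion` names are just placeholders for whatever you are training and evaluating:

```python
import torch
from sklearn.metrics import f1_score

@torch.no_grad()
def evaluate(model, val_loader, criterion, device="cpu"):
    """Return both the plain (unweighted) validation loss and macro-F1."""
    model.eval()
    total_loss, n = 0.0, 0
    all_preds, all_targets = [], []
    for x, y in val_loader:
        x, y = x.to(device), y.to(device)
        logits = model(x)
        total_loss += criterion(logits, y).item() * y.size(0)
        n += y.size(0)
        all_preds.append(logits.argmax(dim=1).cpu())
        all_targets.append(y.cpu())
    preds = torch.cat(all_preds).numpy()
    targets = torch.cat(all_targets).numpy()
    # Macro-F1 averages the per-class F1 scores, so the minority class
    # counts exactly as much as the majority class.
    return total_loss / n, f1_score(targets, preds, average="macro")

# During a parameter search, pick the configuration with the best
# macro-F1 rather than the lowest unweighted validation loss.
```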
I believe this reasoning applies not only to comparing models but also to lr schedulers that monitor a validation metric. PyTorch's ReduceLROnPlateau adjusts the learning rate based on a validation metric; the examples in the docs use validation loss, but for the same reason stated above, I believe this might not be the best choice when the data are imbalanced.
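For the scheduler case, my understanding is that one could simply feed ReduceLROnPlateau a balanced metric in `mode="max"` instead of the validation loss. A rough sketch of what I have in mind, reusing the `evaluate` helper above (the optimizer choice, `train_one_epoch`, and `num_epochs` are placeholders):

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# mode="max" because a balanced metric like macro-F1 should increase,
# whereas the default mode="min" is meant for a loss.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.5, patience=3
)

for epoch in range(num_epochs):
    train_one_epoch(model, train_loader, criterion, optimizer)  # placeholder
    val_loss, val_macro_f1 = evaluate(model, val_loader, criterion)
    # Step on the balanced metric instead of the unweighted validation loss.
    scheduler.step(val_macro_f1)
```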
I know there are posts that partly answer this, but I have not found any that discusses model comparison or lr scheduling with validation loss under class imbalance.