For a deep learning paper, I need to train several CNN models and compare their performance. The models use different architectures, so their designs differ substantially.
Should I use the same learning rate for all models during training (I've made sure they all use the same batch size, loss function, etc.), or should I write a naive tuner and find the best learning rate for each model?
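To make "naive tuner" concrete, here is a rough sketch of what I have in mind: a small grid search run separately for each model, keeping the learning rate with the lowest validation loss. Everything in it is a placeholder. `toy_validation_loss`, the model names, and the candidate grid are hypothetical, and the quadratic objective only stands in for actually training a CNN and evaluating it on a held-out validation set.

```python
import math


def toy_validation_loss(curvature, lr, steps=50):
    """Hypothetical stand-in for "train a model, return validation loss".

    Runs plain gradient descent on f(w) = 0.5 * curvature * w**2.
    A different curvature plays the role of a different architecture.
    """
    w = 1.0
    for _ in range(steps):
        w -= lr * curvature * w  # gradient of f is curvature * w
        if not math.isfinite(w) or abs(w) > 1e6:
            return float("inf")  # treat divergence as the worst outcome
    return 0.5 * curvature * w * w


def tune_learning_rate(train_and_eval, candidate_lrs):
    """Naive grid search: try every candidate, keep the lowest loss."""
    results = {lr: train_and_eval(lr) for lr in candidate_lrs}
    best_lr = min(results, key=results.get)
    return best_lr, results[best_lr]


# Assumed grid and toy "models"; a real run would plug in actual training code.
CANDIDATE_LRS = [1e-3, 1e-2, 1e-1, 1.0]
MODELS = {"model_a": 1.0, "model_b": 10.0, "model_c": 100.0}

best = {
    name: tune_learning_rate(
        lambda lr, c=curvature: toy_validation_loss(c, lr), CANDIDATE_LRS
    )
    for name, curvature in MODELS.items()
}

for name, (lr, loss) in best.items():
    print(f"{name}: best lr = {lr}, validation loss = {loss:.3g}")
```

In this toy setup each "model" ends up with a different best learning rate, and a rate that suits one of them makes another diverge. Whether real CNNs behave the same way is exactly what I'm unsure about.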
Thank you in advance!!