
For a deep learning paper, I need to train several CNN models and compare their performance. They are based on different architectures, so their designs differ.

I'm wondering whether I should use the same learning rate for all models during training (I've made sure they all have the same batch size, the same loss function, etc.), or whether I should build a naive tuner and find the best learning rate for each model.

Thank you in advance!!

Dwa
  • 111

1 Answer


You cannot rule out that different architectures will benefit from different learning rates, so if you have the resources, you should give it a try.

Just make sure that you use a decent optimizer and that you follow the standard best practices for training DNNs.
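
For example, a naive per-model learning-rate sweep could look like the sketch below. This is only an illustrative PyTorch example: the two toy stand-in models, the synthetic data, the Adam optimizer, and the candidate learning rates are my own assumptions, so swap in your real architectures, data loaders, and training loop.

```python
# Minimal sketch of a naive per-model learning-rate sweep (illustrative only).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins for the architectures being compared; replace with the real models.
def make_model_a():
    return nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))

def make_model_b():
    return nn.Sequential(nn.Conv2d(3, 16, 5, padding=2), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))

# Synthetic stand-in data; replace with the real training/validation loaders.
x, y = torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,))
train_loader = DataLoader(TensorDataset(x[:192], y[:192]), batch_size=32, shuffle=True)
val_loader = DataLoader(TensorDataset(x[192:], y[192:]), batch_size=32)

def validate(model, loader, criterion):
    # Average validation loss, used here as the tuning criterion.
    model.eval()
    total, n = 0.0, 0
    with torch.no_grad():
        for xb, yb in loader:
            total += criterion(model(xb), yb).item() * len(xb)
            n += len(xb)
    return total / n

def sweep(model_fn, lrs, epochs=3):
    # Train one fresh copy of the model per candidate LR and return the best LR.
    criterion = nn.CrossEntropyLoss()  # same loss for every model, as in the question
    results = {}
    for lr in lrs:
        model = model_fn()
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            model.train()
            for xb, yb in train_loader:
                opt.zero_grad()
                loss = criterion(model(xb), yb)
                loss.backward()
                opt.step()
        results[lr] = validate(model, val_loader, criterion)
    return min(results, key=results.get), results

for name, fn in [("model_a", make_model_a), ("model_b", make_model_b)]:
    best_lr, results = sweep(fn, lrs=[1e-2, 1e-3, 1e-4])
    print(name, "best lr:", best_lr, results)
```

Reporting the per-model learning rate found this way also makes the comparison easier to reproduce.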

frank
  • 10,797
  • Thank you! So if I end up using a different learning rate for each model, do I need to state what I used for each one in the paper, or do you think I shouldn't include such detailed information like the learning rate? Thank you again! – Dwa Aug 25 '22 at 15:08
  • It doesn't hurt to add it. Maybe provide an appendix with all the details. – frank Aug 25 '22 at 15:41