
I'm building a deep learning model to predict times of arrival. By definition, the time of arrival is always positive. I'm wondering if I can use a ReLU as the activation function of my last layer to force the predictions to always be non-negative. On the one hand, it seems intuitive that this should work, but I'm not sure if there's some effect that I'm not considering. Thank you for your help!

alexmolas
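Not part of the original post, but a minimal sketch (in PyTorch, with made-up names such as `ETARegressor` and arbitrary layer sizes) of the setup being asked about: a regression network whose final activation is a ReLU, so predictions are clamped to be non-negative.

```python
import torch
import torch.nn as nn

class ETARegressor(nn.Module):
    """Toy regression model; the final ReLU forces predictions >= 0."""
    def __init__(self, n_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
            nn.ReLU(),  # last-layer ReLU: predicted arrival time cannot be negative
        )

    def forward(self, x):
        return self.net(x)

# Tiny usage example with made-up data.
model = ETARegressor(n_features=8)
x = torch.randn(32, 8)
y = torch.rand(32, 1) * 100.0  # positive targets, e.g. minutes until arrival
loss = nn.MSELoss()(model(x), y)
loss.backward()
```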
  • The dying ReLU phenomenon could be problematic. An alternative approach would replace the standard regression losses (MSE, MAE) with the cross-entropy of a positive-valued random variable. See https://stats.stackexchange.com/questions/378274/how-to-construct-a-cross-entropy-loss-for-general-regression-targets for more information. – Sycorax Mar 16 '22 at 15:23
  • Any reason you're asking our permission instead of just trying it ;)? My two cents: use the softplus instead, which is everywhere differentiable with nonzero derivative. – John Madden Aug 29 '22 at 22:01
  • What about learning on the log scale? Check the distribution of your outcome; being a time interval, it is likely a mixture of exponentials (i.e., gamma-distributed). – Bakaburg Sep 01 '22 at 05:49
  • I think the first thing to do is to check your hypothesis. You can hypothesize that y is Poisson. Then you can use exp as the activation function. (But you can experiment with ReLU or leaky ReLU as well.) – Nathan Manzambi Aug 29 '22 at 18:21
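Again not from the thread itself, but a rough sketch of the three alternatives raised in the comments: a softplus output, an exp link paired with a Poisson-style loss, and regressing on the log scale. The backbone architecture and sizes here are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Shared toy backbone; the sizes are made up for illustration.
backbone = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))

x = torch.randn(32, 8)
y = torch.rand(32, 1) * 100.0 + 1e-3  # strictly positive targets

# 1) Softplus output: smooth, everywhere differentiable, always positive.
pred_softplus = nn.functional.softplus(backbone(x))

# 2) exp link: the network outputs a log-rate; exp makes it positive and
#    pairs naturally with a Poisson-style negative log-likelihood.
log_rate = backbone(x)
pred_exp = torch.exp(log_rate)
poisson_loss = nn.PoissonNLLLoss(log_input=True)(log_rate, y)

# 3) Learn on the log scale: regress log(y) with plain MSE,
#    then exponentiate at prediction time.
mse_on_log = nn.MSELoss()(backbone(x), torch.log(y))
pred_log_scale = torch.exp(backbone(x))
```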

0 Answers