I am using the SGD and Adam optimizers in a simple NN for a binary classification problem.
The model converges every time, but when I run the same model 100 times, I don't get the same coefficients on each run. The coefficient values deviate from the benchmark coefficients (which I obtained from Statsmodels) by around 8-10%.
Is there a specific reason why this happens, or are there any results/proofs about SGD/Adam not converging to the same coefficient values on every run?
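For concreteness, here is a minimal, self-contained sketch of what I mean, on synthetic stand-in data (my real dataset is different). I'm assuming here that the "simple NN" is a single sigmoid unit, i.e. effectively a logistic regression, so that its weights line up with the statsmodels coefficients; regularization is left out of this sketch and shown with the hyper-parameters further down.

```python
import numpy as np
import statsmodels.api as sm
import tensorflow as tf

# Synthetic placeholder data
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3)).astype("float32")
logits = X @ np.array([1.0, -2.0, 0.5], dtype="float32") + 0.3
y = (rng.random(400) < 1.0 / (1.0 + np.exp(-logits))).astype("float32")

# Benchmark: statsmodels Logit solves the MLE deterministically,
# so it returns the same coefficients on every call.
benchmark = sm.Logit(y, sm.add_constant(X)).fit(disp=0).params  # [const, w1, w2, w3]

def fit_once(seed):
    tf.keras.utils.set_random_seed(seed)  # weight init and shuffling change with the seed
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(X.shape[1],)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.005),
                  loss="binary_crossentropy")
    model.fit(X, y, epochs=300, batch_size=len(X) // 4, verbose=0)
    w, b = model.layers[-1].get_weights()
    return np.concatenate([b, w.ravel()])  # [intercept, coefficients]

# 10 repetitions instead of 100 just to keep the sketch quick
runs = np.array([fit_once(seed) for seed in range(10)])
rel_dev = (runs - benchmark) / benchmark
print(runs.std(axis=0))              # spread of the NN coefficients across runs
print(np.abs(rel_dev).mean(axis=0))  # average relative deviation from the benchmark
```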
I am implementing a simple NN model using the TensorFlow library. Below are the hyper-parameters I am using:
Epoch: 300
Optimizer: Adam/SGD
Activation: Sigmoid
Learning rate: 0.005
Batch size: n/4 (n is the # of data points)
Regularization: L1/L2
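This is roughly how those hyper-parameters map onto the Keras API; the input dimension, dataset size, and regularization strength (0.01) are placeholders rather than my actual values.

```python
import tensorflow as tf

n_features = 3  # placeholder input dimension
n = 400         # placeholder number of data points

# Either optimizer, same learning rate 0.005
optimizer = tf.keras.optimizers.Adam(learning_rate=0.005)
# optimizer = tf.keras.optimizers.SGD(learning_rate=0.005)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    tf.keras.layers.Dense(
        1,
        activation="sigmoid",
        # L1 or L2 penalty; 0.01 is a placeholder strength
        kernel_regularizer=tf.keras.regularizers.l2(0.01),
        # kernel_regularizer=tf.keras.regularizers.l1(0.01),
    ),
])
model.compile(optimizer=optimizer, loss="binary_crossentropy")

# Training call with batch size n/4 and 300 epochs (X, y are the actual data):
# model.fit(X, y, epochs=300, batch_size=n // 4)
```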