I am looking into how the random initialisation of a model impacts its final results after tuning. This is a well-known problem for models with random initialisation and stochastic training, notably deep neural networks and gradient-boosted trees (GBDT).
It seems to me that modern implementations of simpler models, such as linear/logistic regression, can suffer from this too. I suspect this because the sklearn implementation of logistic regression exposes a random seed parameter (`random_state`).
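For context, my understanding from the sklearn documentation is that `random_state` is only consumed by the solvers that shuffle the data (`'sag'`, `'saga'`, `'liblinear'`), while `'lbfgs'` and `'newton-cg'` are deterministic. A quick sketch of how I would check this (the magnitudes printed are illustrative, I have not verified them):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

for solver in ("lbfgs", "saga"):
    coefs = [
        LogisticRegression(solver=solver, random_state=seed, max_iter=1000)
        .fit(X, y)
        .coef_
        for seed in (0, 1)
    ]
    # With 'lbfgs' I would expect the two fits to be bit-identical; with
    # 'saga' they can differ slightly, because the seed changes the order
    # in which samples are visited.
    print(solver, abs(coefs[0] - coefs[1]).max())
```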
I know that vanilla linear regression, being a form of projection, has a unique closed-form solution. Logistic regression is a convex optimisation problem, so it also has a unique solution whenever the loss is strictly convex, but because of its non-linear 'activation' there is no analytical solution and it must be solved iteratively. Uniqueness also depends on collinearity of the features and on any regularisation added: if X1 and X2 are perfectly collinear (say X2 = X1), the loss is constant along the lines β1 + β2 = const in coefficient space, allowing multiple solutions unless a penalty breaks the tie. Part of the problem seems to appear with stochastic learning algorithms when parameters are not set correctly (not enough learning steps), but sklearn issues a warning if the algorithm hits its iteration limit without converging.
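Here is a toy sketch of the failure mode I have in mind (my own construction, assuming scikit-learn >= 1.2 for `penalty=None`; on older versions it was `penalty='none'`). With two nearly identical columns and no penalty, the loss surface has an almost-flat valley along β1 + β2 = const, and a stochastic solver stopped at a loose tolerance can land at seed-dependent points in that valley without any ConvergenceWarning, because the stopping criterion is genuinely met:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 2_000
x = rng.normal(size=n)
# near-perfect collinearity: second column is the first plus tiny noise
X = np.column_stack([x, x + 1e-6 * rng.normal(size=n)])
y = (x + rng.normal(size=n) > 0).astype(int)

for seed in (0, 1, 2):
    clf = LogisticRegression(
        penalty=None,     # no regularisation, so the valley stays (almost) flat
        solver="saga",    # stochastic solver: the seed controls sample shuffling
        tol=1e-1,         # loose tolerance: stops early, but "converged"
        max_iter=10_000,  # large enough that max_iter is never hit -> no warning
        random_state=seed,
    ).fit(X, y)
    b1, b2 = clf.coef_[0]
    print(f"seed={seed}: b1={b1:+.4f}  b2={b2:+.4f}  b1+b2={b1 + b2:+.4f}")
```

How far apart the seeds land here depends on the tolerance, the solver and the data; my point is only that nothing warns me, since each fit satisfies its stopping criterion. But this is synthetic, hence my question below.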
I am wondering about concrete examples where logistic regression converges to different results with different random seeds, why it happens, whether it can happen 'silently' (no warning raised), and whether there is a general rule of thumb to avoid it.
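The only rule of thumb I have come up with so far is to make non-convergence loud and to refit with a few seeds as a sanity check. `fit_checked` below is a hypothetical helper of my own, not a sklearn API:

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

def fit_checked(X, y, seeds=(0, 1, 2), atol=1e-3, **kwargs):
    """Fit once per seed; fail loudly on non-convergence or seed-dependence."""
    coefs = []
    with warnings.catch_warnings():
        # promote ConvergenceWarning to an error so a max_iter hit
        # cannot pass silently
        warnings.simplefilter("error", ConvergenceWarning)
        for seed in seeds:
            clf = LogisticRegression(random_state=seed, **kwargs).fit(X, y)
            coefs.append(clf.coef_.copy())
    # compare coefficients across seeds
    spread = np.max([np.abs(c - coefs[0]).max() for c in coefs])
    if spread > atol:
        raise RuntimeError(f"seed-dependent fit: coefficient spread {spread:.2e}")
    return clf
```

Promoting ConvergenceWarning to an error only catches the max_iter case, so the seed-spread check is there to catch the loose-tolerance case sketched above. I do not know whether this is considered good practice, though.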
Do you have any concrete example of logistic regression converging to different values depending on the initial seed? (Preferably with real-life data, ideally with a serious source, and, if applicable, a problematic case where sklearn would not issue a warning.)