Having done an interesting Coursera module on Linear Regression, in which the starting values for Gradient Descent were provided, I am wondering about something that was not touched upon and for which I could not find an answer:
- How should the initial starting values for the weights be estimated? I cannot seem to find anything on this.
I think the choice does not matter if you have infinite computing power and time.
But maybe I am missing something obvious.
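
To make my guess concrete, here is a minimal sketch of what I mean. The toy data, the learning rate, the step count, and the helper name `gradient_descent` are all my own assumptions for illustration, not anything from the course. It runs plain batch gradient descent on a one-feature linear regression from two very different starting points and compares where they end up:

```python
def gradient_descent(xs, ys, w0, w1, lr=0.05, steps=5000):
    """Batch gradient descent for y ~ w0 + w1 * x with a mean squared error cost.

    lr and steps are assumed values picked for this toy data set.
    """
    m = len(xs)
    for _ in range(steps):
        # Prediction error for every training example with the current weights.
        errors = [w0 + w1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of (1 / 2m) * sum(error^2) w.r.t. w0 and w1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both weights simultaneously.
        w0 -= lr * grad0
        w1 -= lr * grad1
    return w0, w1


# Made-up data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

from_zero = gradient_descent(xs, ys, 0.0, 0.0)    # start at the origin
from_far = gradient_descent(xs, ys, 10.0, -10.0)  # start far from the solution
print(from_zero, from_far)
```

On this toy problem both runs should end up at essentially the same weights, close to `w0 = 1` and `w1 = 2`, which is what I would expect if the squared-error cost for linear regression has a single minimum. What I cannot tell from this is whether the starting values still matter in practice, for example for how many steps are needed.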