In this hypothetical wrong procedure, it sounds like the number of regressors is chosen to minimize the error over the whole data set. In that case, the model will engorge itself on regressors until it uses all of them, because the training error can never increase as more are added. The reason is that the error is evaluated on the same data used to choose the weights, which allows the model to overfit, i.e. to fit random structure in the training data that isn't representative of the underlying distribution that produced it. The training error is therefore optimistically biased; when run on new data from the same distribution, the model will have greater error, and will regret its former gluttony.
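Here's a minimal sketch of that effect (in Python, assuming numpy and scikit-learn; the simulated data, the choice of 3 "true" regressors, and the sample sizes are all made up for demonstration). Training MSE never increases as nested regressors are added, while test MSE eventually rises once the model starts fitting noise:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n_train, n_test, p = 50, 1000, 30

X_train = rng.normal(size=(n_train, p))
X_test = rng.normal(size=(n_test, p))

# Only the first 3 regressors truly matter; the rest are pure noise.
beta = np.zeros(p)
beta[:3] = [2.0, -1.0, 0.5]
y_train = X_train @ beta + rng.normal(size=n_train)
y_test = X_test @ beta + rng.normal(size=n_test)

# Fit nested models with the first k regressors and compare errors.
for k in (1, 3, 10, 30):
    model = LinearRegression().fit(X_train[:, :k], y_train)
    mse_train = mean_squared_error(y_train, model.predict(X_train[:, :k]))
    mse_test = mean_squared_error(y_test, model.predict(X_test[:, :k]))
    print(f"k={k:2d}  train MSE={mse_train:.3f}  test MSE={mse_test:.3f}")
```

If you run this, the training MSE keeps shrinking all the way to k=30, but the test MSE bottoms out around k=3 and then climbs — exactly the gap between in-sample and out-of-sample error described above.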
Regarding answer #1, 'too complex' means that more regressors are chosen than necessary, leading to overfitting. This assumes that the 'proper' model uses only a smaller subset of the regressors.
That said, using stepwise regression is generally not a good idea in the first place (e.g. see here).