I've been reading more about linear regression and the assumptions it makes. What are the consequences of violating some of the assumptions involved?
For instance, you need the dataset to exhibit little or no multicollinearity. What happens if I ignore this?
For instance, suppose I have a dataset where x_3 and x_4 are just duplicates of each other (and thus perfectly correlated). The only difference I can see is that, instead of having one line of best fit, I now have infinitely many:
E.g. if the line of best fit without x_4 had Bx_3 in it, then I can just distribute B between x_3 and x_4: any split b_3 + b_4 = B gives identical predictions. So if B = 5, I could use 4x_3 + 1x_4 or 3.5x_3 + 1.5x_4, etc.
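A quick numerical sketch (on made-up data, just to illustrate what I mean) seems to confirm this: when x_4 is an exact copy of x_3, every split of the coefficient across the two columns produces exactly the same fitted values and residuals:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x3 = rng.normal(size=n)
x4 = x3.copy()                      # exact duplicate -> perfect collinearity
y = 5 * x3 + rng.normal(scale=0.1, size=n)  # "true" coefficient is 5

# Any split (b3, b4) with b3 + b4 = 5 yields the same residual sum of squares,
# so there is no unique least-squares solution:
for b3, b4 in [(5.0, 0.0), (4.0, 1.0), (3.5, 1.5)]:
    rss = np.sum((y - (b3 * x3 + b4 * x4)) ** 2)
    print(f"b3={b3}, b4={b4}, RSS={rss:.6f}")
```

(Interestingly, np.linalg.lstsq on the stacked matrix [x3, x4] returns the minimum-norm solution, which here splits the coefficient evenly as 2.5/2.5.)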
So, it seems like:
1) Maybe it's not disastrous to violate the multicollinearity assumption?
2) Is it much worse to violate the others? If so, what are the consequences?
Thanks!