I have 1 dependent variable and 33 independent variables (continuous, categorical and dichotomous). Correlation analyses (2-tailed) show that the DV is correlated with only 7 of the IVs, and most of those correlations are very weak (about 0.1 or less).
Is it correct to put only IVs that are correlated with the DV into the regression model?
P.S. What's the use of the correlation matrix (1-tailed) produced with the regression analysis?
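To make the screening question concrete, here is a minimal sketch of why a predictor's correlation with the DV on its own can be a poor guide to its role in a multiple regression. The data are hypothetical toy values (not from my study), built so that `y = x1 - x2` exactly, and the coefficient formulas are the standard closed form for a regression with exactly two predictors.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / sqrt(saa * sbb)

# Hypothetical toy data, constructed so that y = x1 - x2 exactly.
x2 = [-1, 1, -1, 1]
y = [-1, -1, 1, 1]
x1 = [a + b for a, b in zip(x2, y)]  # [-2, 0, 0, 2]

r_y1 = pearson(y, x1)   # correlation of the DV with x1
r_y2 = pearson(y, x2)   # correlation of the DV with x2
r_12 = pearson(x1, x2)  # correlation between the two predictors

# Standardized regression coefficients for the two-predictor model,
# computed from the three pairwise correlations.
beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)
beta2 = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)

print(f"r(y, x2) = {r_y2:.3f}, standardized beta for x2 = {beta2:.3f}")
```

In this constructed case `x2` has zero correlation with `y`, yet its standardized coefficient is about -1 once `x1` is in the model, so screening on pairwise correlations would have dropped it.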
a) There seem to be strong correlations between some of the predictors, e.g. whether someone has taken Biology and whether someone has taken Chemistry. However, it doesn't seem appropriate to just drop either Biology or Chemistry. What should I do?
b) You're right that a predictor might be important when other variables are present. So, should I simply put all 33 predictors in the regression model, since there's very little literature on which variables might be predictors? That is, the analysis is exploratory in nature.
– statistics newbie Aug 25 '15 at 13:26
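For point (a), one way to put a number on how strongly two predictors overlap is the variance inflation factor (VIF). The sketch below uses made-up 0/1 indicators for Biology and Chemistry (hypothetical values, not the real data). In general the VIF for a predictor is 1 / (1 - R²) from regressing it on all the other predictors; with only two predictors this reduces to 1 / (1 - r²), which is the special case shown here.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation (the phi coefficient for two 0/1 variables)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / sqrt(saa * sbb)

# Hypothetical indicators: 1 = has taken the subject, 0 = has not.
biology   = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
chemistry = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

r = pearson(biology, chemistry)

# Two-predictor special case of the variance inflation factor.
vif = 1 / (1 - r ** 2)

print(f"r = {r:.2f}, VIF = {vif:.2f}")
```

A VIF near 1 means the two predictors carry mostly separate information; larger values mean their coefficient estimates become less precise when both are in the model. With all 33 predictors, the same idea applies but each R² has to come from a regression on the remaining 32.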