
Let's say I have 3 independent variables (No. of Relationships, Health, and Salary) and I put them into a multiple linear regression to predict Happiness.

What would it mean if the overall regression were significant (p < .005), but only the No. of Relationships variable had a significant coefficient?

For instance, if I were to say that No. of Relationships, Health, and Salary predicted Happiness in this model, would that be factually correct? Additionally, would it be factually correct to say that people can increase their happiness by increasing their no. of relationships, health, and salary?

The model significantly predicted happiness, but surely this is driven only by the No. of Relationships variable, as no other variable had a significant coefficient?

Hopefully, this makes sense. I'm just a bit stuck on how to interpret non-significant variables in a significant MLR.
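
To make the setup concrete, here is a minimal sketch in Python with statsmodels; the data are simulated, and the variable names, sample size, and effect size are all made up purely for illustration, not from any real study:

```python
# Hypothetical illustration: simulate data where, by construction, only
# No. of Relationships drives Happiness, then fit an OLS model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 200
relationships = rng.normal(size=n)
health = rng.normal(size=n)
salary = rng.normal(size=n)

# Happiness depends only on relationships (made-up coefficient of 0.5).
happiness = 0.5 * relationships + rng.normal(size=n)

X = sm.add_constant(np.column_stack([relationships, health, salary]))
fit = sm.OLS(happiness, X).fit()

print(fit.f_pvalue)   # p-value of the overall F-test
print(fit.pvalues)    # per-coefficient p-values: [const, rel., health, salary]
```

In a simulation like this, the overall F-test tends to come out highly significant while only the relationships coefficient gets a small p-value, which is exactly the pattern I'm asking about.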

  • https://stats.stackexchange.com/questions/3549 addresses many aspects of this question--perhaps it helps. The interpretations of the variables are unaffected by your assessment of significance, but your decisions about including them in the model or not might be affected. I write "might" because if theory tells you to include a variable, usually you should include it no matter what the p-value may be. – whuber Jul 18 '22 at 18:31
  • Thanks @whuber, that was a helpful link. So essentially I can say that even though health and salary did not have significant coefficients, they still predict happiness when added to a model featuring no. of relationships, health, and salary. – Andrew Drewmore Jul 18 '22 at 19:49
  • They might or might not predict happiness. A high p-value indicates your data don't suffice to draw such a conclusion with enough confidence. That's why theory can be important: if it says those variables might be predictive, then failing to detect that fact in your data is no reason to drop those variables! – whuber Jul 18 '22 at 20:22
  • I see. So just for a conclusive example: can we say that while the model featuring no. of relationships, health, and salary significantly predicted happiness (p = .001), the only significant coefficient was no. of relationships (p < .001), not health (p = .412) or salary (p = .800)? So, essentially, we can conclude that only no. of relationships significantly predicted happiness in this sample... However, as theory says all variables should be predictive, it's uncertain whether these results would hold outside of this sample? – Andrew Drewmore Jul 18 '22 at 22:07
  • As a note: this conclusion assumes there is no multicollinearity ^ – Andrew Drewmore Jul 18 '22 at 22:16
  • And -- based on this analysis -- you cannot tell people that they can increase their happiness by increasing the number of their relationships, their health, and their salary. (Though people might not be surprised to hear so.) Building a model to predict happiness from the characteristics people have doesn't imply that you have a model that predicts what would happen if a person acts to change those characteristics. – dipetkov Jul 18 '22 at 23:28
  • You may have no multicollinearity but your independent variables may still be correlated. You can determine whether there is correlation between variables and then combine or drop the ones that are correlated (see the VIF sketch after these comments). The rationale is that if two of them are strongly correlated, knowing one gives most of the information about the other, so there is little added information in considering both. The other option is to restate the model using both the variables and their interactions. Numerically this is cleaner. – LDBerriz Jul 19 '22 at 12:26
  • In your current model, the variables whose coefficients have p-values > 0.05 are consistent with being noise that does not explain the dependent variable. Also pay attention to whuber's comment about dropping or not dropping variables in the context of the theory (or hypothesis) you are testing. If the theory indicates all three variables should be significant, you may have a random sample that did not capture this fact. – LDBerriz Jul 19 '22 at 12:27
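
Following the comments above about correlated predictors, here is a minimal sketch of a variance inflation factor (VIF) check with statsmodels; as in the earlier sketch, the simulated predictors are made up purely for illustration:

```python
# Sketch: checking the predictors for multicollinearity via VIFs.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 200
relationships = rng.normal(size=n)
health = rng.normal(size=n)
salary = rng.normal(size=n)

X = sm.add_constant(np.column_stack([relationships, health, salary]))

# A VIF near 1 means a predictor is nearly uncorrelated with the others;
# values above roughly 5-10 are a common rule-of-thumb flag.
for i, name in enumerate(["const", "relationships", "health", "salary"]):
    print(f"{name}: VIF = {variance_inflation_factor(X, i):.2f}")
```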
