I'm running a linear regression in SPSS to test for effects of a binary variable (X) on cost of hospital admission. On its own, the variable is correlated with a cost increase of around $3000.

When the model has variable X and a few other correlated categorical variables, I get a B value of $2400 for X.

As soon as I add length of stay, I get a B value for X of -$2400.

What might be causing this? I already tested for multicollinearity, and all of my VIF values are less than 1.1. Switching to a generalized linear model has not helped either.
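To make the pattern concrete, here is a minimal simulated sketch in Python with NumPy (not SPSS, and not the actual hospital data). Every name and number in it is an assumption chosen for illustration: `x` adds about 3 days to `los`, each day costs about $1600, and the direct effect of `x` at a fixed length of stay is -$2400. Under those assumptions the coefficient without length of stay is expected to be about -2400 + 1600 × 3 = +2400, and the VIF for `x` can still sit below 1.1 because `los` varies a lot for reasons unrelated to `x`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data-generating process (all values are assumptions).
x = rng.binomial(1, 0.5, n).astype(float)        # binary variable X
los = rng.gamma(1.0, 7.0, n) + 3.0 * x           # X adds ~3 days of stay
cost = 2000 + 1600 * los - 2400 * x + rng.normal(0, 13000, n)


def ols(y, *cols):
    """Least-squares coefficients, intercept first, then one per column."""
    design = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Coefficient on X without and with length of stay in the model.
b_without_los = ols(cost, x)[1]
b_with_los = ols(cost, x, los)[1]

# VIF for X in the two-predictor model: 1 / (1 - R^2) from regressing X on LOS.
fitted = np.column_stack([np.ones(n), los]) @ ols(x, los)
r2 = 1 - np.sum((x - fitted) ** 2) / np.sum((x - x.mean()) ** 2)
vif = 1 / (1 - r2)

print(f"B for X without LOS: {b_without_los:.0f}")
print(f"B for X with LOS:    {b_with_los:.0f}")
print(f"VIF for X:           {vif:.3f}")
```

In this sketch the sign change reflects two different questions (the total difference in cost between groups versus the difference at a fixed length of stay), not a coding or collinearity problem. Whether that is what is happening in the real data is something the simulation cannot show.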

I have tried transforming every variable with log, inverse, square root, etc. Length of stay and cost have a fairly linear relationship (R² ≈ 0.42).

I am convinced at this point that the issue is with how the variable is coded or how I am imputing it. However, recoding the variable X from 0 and 1 to 1 and 2 or 10 and 20 did not help. Has anyone had this problem before?

James
  • Re your final question: all the time. See https://stats.stackexchange.com/search?q=regression+coefficient+switch+sign for instance. If the duplicate doesn't apply in your case, then please include more details in your post. – whuber Mar 02 '24 at 17:27
