I fitted a two-way ANOVA and I am trying to understand how the mean of each group can be recovered from the intercept and the coefficients of each term. From my understanding, the reference (base) group is partner_status=high, fcategory=high.
import statsmodels.api as sm
from statsmodels.formula.api import ols
moore = sm.datasets.get_rdataset("Moore", "carData", cache=True) # load
data = moore.data
data = data.rename(columns={"partner.status": "partner_status"})
moore_lm = ols('conformity ~ fcategory+partner_status+fcategory*partner_status', data=data).fit()
print(moore_lm.summary())
The Intercept coefficient gives the mean of the group partner_status=high, fcategory=high, which is 11.86. However, for the term fcategory[T.low]:partner_status[T.low], the coefficient is -9.27, while the mean of that group is 8.9 (from the groupby results). How do we get the mean of that group from the coefficients?
                                                coef   std err         t     P>|t|    [0.025    0.975]
Intercept                                    11.8571     1.731     6.851     0.000     8.356    15.358
fcategory[T.low]                              5.5429     2.681     2.067     0.045     0.120    10.966
fcategory[T.medium]                           2.4156     2.214     1.091     0.282    -2.063     6.894
partner_status[T.low]                         0.7679     2.370     0.324     0.748    -4.026     5.561
fcategory[T.low]:partner_status[T.low]       -9.2679     3.451    -2.686     0.011   -16.247    -2.288
fcategory[T.medium]:partner_status[T.low]    -7.7906     3.573    -2.181     0.035   -15.017    -0.564
Below are the group means of conformity for each combination of partner_status and fcategory (from groupby).
                          conformity
partner_status fcategory
high           high        11.857143
               low         17.400000
               medium      14.272727
low            high        12.625000
               low          8.900000
               medium       7.250000
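The question does not show how the table of group means was produced. A minimal sketch of one way to get it with pandas, assuming the renamed DataFrame `data` from the code above (the helper name `cell_means` is hypothetical, introduced only for illustration):

```python
import pandas as pd


def cell_means(df: pd.DataFrame) -> pd.Series:
    """Mean conformity for every partner_status x fcategory cell.

    `cell_means` is a hypothetical helper name, not part of pandas
    or statsmodels. It assumes the column names used in the question.
    """
    return df.groupby(["partner_status", "fcategory"])["conformity"].mean()


# Usage with the renamed DataFrame from the question:
# print(cell_means(data))
```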
fcategory[T.medium] and partner_status[T.low] coefficients. – EdM Jan 27 '23 at 15:51
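As a sketch of how the coefficients map to group means: with the default treatment (dummy) coding, each coefficient is an offset rather than a mean, so a cell mean is the intercept plus every coefficient whose level applies to that cell. For partner_status=low, fcategory=low that is 11.8571 + 5.5429 + 0.7679 - 9.2679 = 8.9, matching the groupby result. The snippet below only redoes that arithmetic from the coefficients printed in the summary. The helper `cell_mean` is a hypothetical name, not a statsmodels function.

```python
# Coefficients copied from the summary table in the question.
coef = {
    "Intercept": 11.8571,
    "fcategory[T.low]": 5.5429,
    "fcategory[T.medium]": 2.4156,
    "partner_status[T.low]": 0.7679,
    "fcategory[T.low]:partner_status[T.low]": -9.2679,
    "fcategory[T.medium]:partner_status[T.low]": -7.7906,
}


def cell_mean(fcategory: str, partner_status: str) -> float:
    """Sum the intercept and every coefficient whose level matches the cell.

    Hypothetical helper for illustration. The reference levels
    (fcategory=high, partner_status=high) have no term of their own,
    so a missing key contributes 0.
    """
    f_term = f"fcategory[T.{fcategory}]"
    p_term = f"partner_status[T.{partner_status}]"
    total = coef["Intercept"]
    total += coef.get(f_term, 0.0)               # main effect of fcategory
    total += coef.get(p_term, 0.0)               # main effect of partner_status
    total += coef.get(f"{f_term}:{p_term}", 0.0)  # interaction term
    return total


# Print the reconstructed mean for each of the six cells.
for p in ("high", "low"):
    for f in ("high", "low", "medium"):
        print(p, f, round(cell_mean(f, p), 4))
```

Because the model includes the interaction, it has one parameter per cell, so these sums reproduce the observed group means up to the rounding of the printed coefficients.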