I have a data frame with about 10k rows on which I want to perform an ANCOVA so I can get adjusted means.
Please note that I've never done this before, so I have been jumping from one tutorial to another, but I still want to do it the right way.
My model is Y ~ X * sex, where:
- Y is the dependent variable (continuous)
- X is the continuous independent variable
- sex is the discrete independent variable (here, the sex)
Following this tutorial, I could calculate the mean of Y adjusted for X for each sex:
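# Fit the ANCOVA model; x is the data frame and X the covariate
model = aov(Y ~ sex * X, data = x)
# One row per sex, with X held at its overall mean
data.predict = data.frame(sex = c("Male", "Female"), X = mean(x$X, na.rm = TRUE))
# Adjusted means: predicted Y for each sex at mean(X)
data.frame(data.predict, Y = predict(model, data.predict))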
This gives realistic results, but I realized that anova(aov(Y~sex*X, data=x)) and anova(aov(Y~X*sex, data=x)) give very different results. The calculated means are the same with both models though.
After reading EdM's answer at https://stats.stackexchange.com/a/213358/81974, I tried the car package with Anova(model, type="III"), and this time both orderings give the same results.
I don't really understand why the order of the terms should matter, but it seems that my data are unbalanced (the "Note" in the aov help page says that the results could be misleading in that case).
Knowing this, are the previously calculated adjusted means still usable?
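To make the situation concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the sample sizes, the coefficients, and the data frame name `d` are made up and are not my real data. It fits both orderings, prints the two sequential tables from `anova()`, and computes the adjusted means with `predict()` in the same way as above.

```r
# Simulated, deliberately unbalanced data (all values are made up for illustration)
set.seed(42)
n_male <- 300
n_female <- 100
sex <- factor(rep(c("Male", "Female"), times = c(n_male, n_female)))
# X differs on average between the sexes, so the two predictors are correlated
X <- rnorm(n_male + n_female, mean = ifelse(sex == "Male", 12, 10), sd = 2)
Y <- 5 + 1.5 * X + ifelse(sex == "Male", 2 + 0.3 * X, 0) + rnorm(length(X))
d <- data.frame(Y, X, sex)

fit_a <- aov(Y ~ sex * X, data = d)
fit_b <- aov(Y ~ X * sex, data = d)

# Sequential tables: each row is adjusted only for the rows above it
tab_a <- anova(fit_a)
tab_b <- anova(fit_b)
print(tab_a)
print(tab_b)

# Adjusted means: predicted Y for each sex at the overall mean of X
newdata <- data.frame(
  sex = factor(c("Male", "Female"), levels = levels(d$sex)),
  X = mean(d$X, na.rm = TRUE)
)
adj_a <- predict(fit_a, newdata)
adj_b <- predict(fit_b, newdata)
print(data.frame(newdata, Y_a = adj_a, Y_b = adj_b))
```

In this sketch the row for `sex` changes between the two tables, because it is tested before `X` in one ordering and after `X` in the other, while the interaction row and the residual row stay the same. The two sets of predicted means coincide, since both formulas describe the same fitted model.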

Error() term in the model formula. Otherwise, aov() is essentially the same as lm(). Also, generally you should not compute EMMs for factors that are included in an interaction in the model, unless you do them separately for the interacting factor. The warning message to that effect was apparently omitted from the output shown here. – Russ Lenth Jan 09 '18 at 01:35

The emmeans function is very interesting and gives nearly the same results as my hand computations, which is nice. But as you stated, my question was not about how to compute them but about whether I am allowed to do so. Anyway, please keep your post alive, as I'm sure it will be very useful to other people. – Dan Chaltiel Jan 09 '18 at 10:21

aov will give different results in anova tables depending on the order of the factors in the model. As your results suggest, this does not affect the adjusted means, whether you use your method from predicted values or the emmeans package. You might also note that the results from summary.lm will be the same for either model: modela = lm(Y ~ X*Sex, data = Data); summary(modela); modelb = lm(Y ~ Sex*X, data = Data); summary(modelb) – Sal Mangiafico Jan 09 '18 at 12:22
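The two points raised in these comments can be sketched in base R. This is only an illustration on simulated data (the data frame `d`, the sample sizes, and the coefficients are assumptions, not the original data), and it uses `predict()` rather than emmeans. It checks that both term orderings give the same `lm` fit, then computes the adjusted means for each sex at several values of X instead of only at `mean(X)`, which is one way to read "do them separately for the interacting factor".

```r
# Simulated, unbalanced data (all values are made up for illustration)
set.seed(1)
sex <- factor(rep(c("Male", "Female"), times = c(300, 100)))
X <- rnorm(400, mean = ifelse(sex == "Male", 12, 10), sd = 2)
Y <- 5 + 1.5 * X + ifelse(sex == "Male", 2 + 0.3 * X, 0) + rnorm(400)
d <- data.frame(Y, X, sex)

model_a <- lm(Y ~ X * sex, data = d)
model_b <- lm(Y ~ sex * X, data = d)

# Same fit with either ordering: identical fitted values and residual SE
same_fit <- isTRUE(all.equal(fitted(model_a), fitted(model_b)))
same_sigma <- isTRUE(all.equal(summary(model_a)$sigma, summary(model_b)$sigma))
print(c(same_fit = same_fit, same_sigma = same_sigma))

# Because X interacts with sex, the Male - Female difference depends on X,
# so compute the adjusted means at several X values (here the quartiles)
x_grid <- quantile(d$X, probs = c(0.25, 0.5, 0.75), names = FALSE)
grid <- expand.grid(sex = levels(d$sex), X = x_grid)
grid$Y_hat <- predict(model_a, newdata = grid)
print(grid)

# Male - Female gap at each X value in x_grid
gap <- with(grid, Y_hat[sex == "Male"] - Y_hat[sex == "Female"])
print(data.frame(X = x_grid, gap = gap))
```

With an interaction in the model, a single pair of adjusted means at `mean(X)` describes only one slice of the fitted lines. Reporting the gap at a few X values shows how much it changes across the range of the covariate.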