I have fitted a linear model to my data in R using generalized least squares (gls from the nlme package), as follows:
mod <- gls(Y ~ A * B + Z, data = data)
and used weights due to heteroscedasticity as follows:
mod_varpower <- update(object = mod, weights = varPower())
and then I ran an ANOVA (first question here: let's say A and B are factors and Z is a continuous variable; then I have basically done an ANCOVA, right?)
Anova(mod_varpower) # analysis of variance (Anova from the car package)
which gives me the following output:
> Anova(mod_varpower) # Analysis of variance
Analysis of Deviance Table (Type II tests)
Response: Y
Df Chisq Pr(>Chisq)
A 2 4.9937 0.08234 .
B 2 7.3418 0.02545 *
Z 1 6.3873 0.01149 *
A:B 4 7.9339 0.09403 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
As we can see, the interaction between A and B is not significant; according to Engqvist (2005) I should therefore drop the interaction term. However, now to my question. If I look at contrasts of contrasts, I get the following:
emmip(mod_varpower, A ~ B) # interaction plot
emm_s.t <- emmeans(mod_varpower, ~ A * B) # estimated marginal means
contrast(emm_s.t, interaction = "pairwise") # contrasts of contrasts
> contrast(emm_s.t, interaction = "pairwise")
A_pairwise B_pairwise estimate SE df t.ratio p.value
Metabolite - Normal Metabolite - Normal -0.171 1.63 140 -0.104 0.9170
Metabolite - Low Metabolite - Normal -2.036 1.42 172 -1.430 0.1545
Cotton - Low Metabolite - Normal -1.865 1.43 186 -1.305 0.1934
Metabolite - Normal Metabolite - Low -2.448 1.49 138 -1.639 0.1035
Metabolite - Low Metabolite - Low -3.246 1.33 185 -2.441 0.0156
Cotton - Low Metabolite - Low -0.798 1.39 183 -0.574 0.5666
Metabolite - Normal Cotton - Low -2.277 1.47 183 -1.550 0.1229
Metabolite - Low Cotton - Low -1.211 1.37 156 -0.881 0.3796
Cotton - Low Cotton - Low 1.066 1.45 167 0.733 0.4646
Degrees-of-freedom method: satterthwaite
The fifth row of this output shows a p-value below 0.05, which means there is a significant interaction contrast, if I'm not mistaken. So on what basis can we remove the interaction term as proposed by Engqvist (2005)? Should I rely on the ANOVA output or on the contrasts of contrasts?
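For reference, this is how I would drop the interaction term if I follow Engqvist (2005). This is only a sketch and assumes that update() on a gls fit accepts the usual formula-update notation, with the weights and data carrying over from mod_varpower:

```r
# Sketch: refit the weighted model without the A:B interaction
# (assumes mod_varpower from above exists and nlme/car are loaded)
mod_additive <- update(mod_varpower, . ~ . - A:B)
Anova(mod_additive) # Type II tests on the additive model Y ~ A + B + Z
```

The main effects of A and B and the covariate Z would then be interpreted from this additive fit.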
EDIT
If I look at the correlation between Z and Y, the two are highly correlated and, to my understanding, linearly related. I must mention that stats is not my strong side, so some questions might be a bit obvious.
> Mod <- cor.test(dat$Z, dat$Y, method = "spearman", use = "complete.obs")
> Mod
Spearman's rank correlation rho
data: dat$Z and dat$Y
S = 1131356, p-value < 2.2e-16
alternative hypothesis: true rho is not equal to 0
sample estimates:
rho
0.6047263

A surprising number of people object to this. – Peter Flom Jul 08 '23 at 13:26

As for Z: I've added an edit with a correlation plot for Z and the rho value. As you mentioned in the last paragraph, Z being linearly related to Y is a strong assumption, but I have the correlation plot and literature showing that Z has a strong positive influence on Y. A and B could also influence Z, though; therefore I want to make sure that what I see influencing Y from A and B is actually due to A and B and not to Z. I don't know if this makes sense to you. – Blanca Jul 08 '23 at 18:30

… Z in the model. My suggestion is to try something more flexible than a simple linear adjustment for it. The Spearman correlation test you report is non-parametric, evaluating only the ranks of the observations; it does not indicate linearity. See this page and its links. Even if you did a Pearson test for linear correlation, a curvilinear association of Z with the outcome could still have a strong Pearson correlation coefficient. – EdM Jul 08 '23 at 21:33
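One way to act on EdM's suggestion of a more flexible adjustment for Z (my assumption about what is meant) is a natural spline via splines::ns(); a sketch, with df = 3 chosen arbitrarily:

```r
# Sketch, assuming a spline adjustment for Z is what is intended
# (splines ships with base R; weights carry over from mod_varpower)
library(splines)
mod_spline <- update(mod_varpower, Y ~ A * B + ns(Z, df = 3))
Anova(mod_spline) # compare the A, B, and A:B tests with the linear-Z fit
```

Comparing this fit with the linear-in-Z model (e.g. via AIC) would indicate whether the extra flexibility is warranted.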