
Edit with graph:

I am struggling a bit conceptually to make sense of a result I get when applying a linear mixed model to my reaction time data.

I have a 2×2 within-subjects design. When I plot the data as an interaction plot, one of the two lines lies above the other, with non-overlapping confidence intervals. However, when I fit a linear mixed model, which looks like this:

library(nlme)

model26 <- lme(log(RT_times) ~ location * task,
               random = ~ 1 + location * task | participant,
               weights = varComb(varIdent(form = ~ 1 | location * task)),
               data = data, method = "REML",
               control = list(msMaxIter = 1000, msMaxEval = 1000))
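For reference, here is a minimal sketch of how the main-effect tests can be read off a fit like this (assuming the nlme package; anova.lme reports conditional F-tests and summary gives the coefficient t-table):

anova(model26)                     # sequential conditional F-tests
anova(model26, type = "marginal")  # marginal tests, not dependent on term order
summary(model26)$tTable            # conditional t-tests for each coefficient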

I don't find any significant main effect. This is the output:

Linear mixed-effects model fit by REML
Data: data_sac

Random effects:
 Formula: ~1 + task * condition | pp
 Structure: General positive-definite, Log-Cholesky parametrization
                         StdDev    Corr
(Intercept)              0.2479765 (Intr) tskndf cndtnv
taskundef                0.1391700 -0.708
conditionvalid           0.1722409 -0.672  0.651
taskundef:conditionvalid 0.1848967  0.652 -0.627 -0.990
Residual                 0.2490666

Combination of variance functions:
 Structure: Different standard deviations per stratum
 Formula: ~1 | condition * task
 Parameter estimates:
  invaliddef     validdef invalidundef   validundef
   1.0000000    0.8943147    0.8514028    0.8917650

Fixed effects: log(latency) ~ condition * task
 Correlation:
                         (Intr) cndtnv tskndf
conditionvalid           -0.680
taskundef                -0.688  0.646
conditionvalid:taskundef  0.628 -0.938 -0.673

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-7.10755334 -0.40245682  0.02502696  0.51551241  4.18246501

Number of Observations: 5209
Number of Groups: 56


On the contrary, the p-value for task is about 0.7. I find this very strange, because for another dataset with a comparable graph I do get significant results. I understand that the 95% CIs and the linear mixed model are computed differently, so they might lead to different results, but I don't see how they can be SO different. There does not seem to be anything wrong with my data (I even removed outliers), so it is difficult for me to grasp what is going on.
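As a rough way to see how different the two interval computations can be, here is a sketch (assuming the dplyr package and the column names from the model call; seaborn's default CI is bootstrapped over all trials, but the idea is similar): one interval treats every trial as independent, the other first averages within each participant.

library(dplyr)

# Naive 95% CI: all trials within a cell treated as independent observations
naive_ci <- data %>%
  group_by(task, location) %>%
  summarise(m  = mean(log(RT_times)),
            se = sd(log(RT_times)) / sqrt(n()),
            lo = m - 1.96 * se,
            hi = m + 1.96 * se,
            .groups = "drop")

# Subject-level 95% CI: average within each participant first, then take
# the spread over the 56 participant means per cell
subj_ci <- data %>%
  group_by(participant, task, location) %>%
  summarise(m = mean(log(RT_times)), .groups = "drop") %>%
  group_by(task, location) %>%
  summarise(mm = mean(m),
            se = sd(m) / sqrt(n()),
            lo = mm - 1.96 * se,
            hi = mm + 1.96 * se,
            .groups = "drop")

naive_ci
subj_ci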


Hope the question is clear now. Many thanks for any insight you might provide!

SinC
  • @EdM thanks for your quick comments! I have edited my question and I hope that now it is clearer. – SinC May 17 '23 at 08:26
  • @SextusEmpiricus I updated the post, thanks a lot for the answers – SinC May 17 '23 at 08:27
  • How many measurements do you have and how many participants? Are the confidence intervals also computed with the assumption of random effects? – Sextus Empiricus May 17 '23 at 10:31
  • In particular, how many observations are there per individual for each location/task combination? You might be trying to fit too complex a random-effect model for the data that you have. What happens if you use a simpler intercept-only random effect instead of this model with both an intercept and four slopes as random effects? It also would help to show summaries of the random-effect estimates in addition to the fixed-effect estimates that you display. Were there any warnings returned when you fit the model? – EdM May 17 '23 at 11:54
  • So, I have 56 participants in total. Each has 25 trials per task/location combination (it is not much data per subject, as this is a subset of my original data and this is an exploratory analysis). The confidence intervals are simply those obtained as a default output from Python seaborn. When I do not model the random slopes, the main effect of task is significant (p = 0.03; a sketch of that simpler model is given after these comments). There were no warnings in the output from the model, but I am adding the full output for clarity – SinC May 17 '23 at 12:11
  • Your situation might be a case of "the influence of adding a (random) effect". By including the random effect you effectively reduce the degrees of freedom: a study based on 56 independent measurements versus one based on 1400 independent measurements can give very different estimates of the standard error. This difference is not visible in your confidence intervals, which are computed under the assumption that the 1400 measurements are independent. – Sextus Empiricus May 17 '23 at 12:49
  • There's a difference in predictor terminology among your displays. Some say "condition" (valid vs. invalid) while others say "location" (left vs. right). Please make sure that we are seeing commands and outputs for the same model in all of the displays. Also (admittedly, unlikely to be a problem) your interaction plot is on a linear scale, not the log scale that your model uses. – EdM May 18 '23 at 12:12
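For reference, here is a minimal sketch of the random-intercept-only model suggested in the comments (model_simple is just an illustrative name; the call is the same as above, only the random slopes are dropped). The likelihood-ratio comparison at the end is only a rough check, since the test for dropping random slopes is conservative (variance parameters sit on the boundary under the null):

library(nlme)

# Random intercept only: each participant gets their own baseline, but the
# location/task effects are assumed to be the same for every participant
model_simple <- lme(log(RT_times) ~ location * task,
                    random = ~ 1 | participant,
                    weights = varComb(varIdent(form = ~ 1 | location * task)),
                    data = data, method = "REML",
                    control = list(msMaxIter = 1000, msMaxEval = 1000))

anova(model_simple)           # F-tests for the fixed effects

# Both fits use REML with identical fixed effects, so the random-effects
# structures can be compared directly
anova(model_simple, model26)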