
It may be easier to show you visually what I am trying to compare:

[Image: plot of the mean values at each time point, by condition]

This is what the table for one of the conditions might look like: [Image: data table for one condition]

The graph above plots the mean values at each time point just so it can be visualised for an audience.

Simply speaking, I am looking for a way to tell if Condition 1 produces higher counts than Condition 2 or 3 or Control etc. I have performed a repeated measures ANOVA with multiple comparisons (Tukey correction) and wanted to check if this was an appropriate way of comparing the data. If it is, is it best to compare simple effects within rows, or simply to compare column means?

Alexis
  • "I am looking for a way to tell if Condition 1 produces higher counts than Condition 2 or 3 or Control etc". It does. That seems obvious from the plot. – Robert Long Jun 27 '20 at 14:46
  • Haha I agree. I just thought statistically is there any way to confirm this? I have a series of data, where the differences aren't as obvious. – user28999 Jun 27 '20 at 14:51
  • So you seek a p-value to confirm what you already know ? :) Generally I would advise against such a thing, but if I had to, an rmANOVA seems like a good option. A mixed effects model is another option, though with only 4 groups, perhaps not – Robert Long Jun 27 '20 at 14:55
  • Hang on a minute, can you explain the experimental setup / study design ? The table in the question is just for 1 condition ? What are the columns ? – Robert Long Jun 27 '20 at 15:02
  • Sorry, I can clarify. As you say the table shows one condition. The columns represent pressure measurements. So each experiment is carried out five times and I record pressure at 10 seconds, 20 seconds etc. Each column is a separate experiment and as you say, each "Condition" has a separate table like this. (I apologise for not making this clear before) – user28999 Jun 27 '20 at 15:07
  • Ahh ok, no worries, but that changes things a bit. So what are the points on the plot ? Are they the means of the 5 replications for each condition or is that plot just one of the 5 replications ? And what are the bars on the plots ? – Robert Long Jun 27 '20 at 15:25
  • The points are means of the 5 replications. The bars are 95% confidence intervals. – user28999 Jun 27 '20 at 15:27
  • OK. But there is a downward, near-linear, trend in the plot and rmANOVA will not handle that. Also, I see that your time axis isn't linear, so it is probably more of an exponential decay than a linear decay. What does the underlying theory suggest about the time trend ? A mixed effects model might be best here. Can you post a link to the data ? – Robert Long Jun 27 '20 at 15:52
  • Hi, I'm really sorry, I just needed to clarify with members of the team - they have advised me not to publish any more data unfortunately, bar the bits above, as it's still unpublished. I realise this makes it difficult for you to help me. Essentially what we're looking at is the pressure generated when certain substances are mixed with certain solutions and that's where the "Conditions" come from - so for instance, placing salt in soda is one condition (condition 1) and this would be compared to salt with water (control) and salt with a different type of soda (condition 2) etc. – user28999 Jun 27 '20 at 16:31
  • No need to apologise, most people don't post their data :) I should be able to post an answer, but it would be useful if you can replace the plot with one that has a linear time scale – Robert Long Jun 27 '20 at 17:10
  • I've changed it now. Hopefully this is clearer. – user28999 Jun 27 '20 at 17:19
  • The answer is dependent on what "higher" means for you. Does it mean higher on average ? Higher at the beginning ? Does the sum of all repeated measurements at one condition mean anything ? What is your definition of "higher" ? – Rodolphe Jun 27 '20 at 21:23
  • Hi Rodolphe, That's a really good point and perhaps I was a bit vague in my question. I was looking for a way to compare the values at each time point and determine if they were significantly different which led me down the path of rmANOVA to begin with. – user28999 Jun 27 '20 at 22:23

1 Answer


Here you have repeated measures (5 replications) of an experiment where you made measurements over 10 time points under 4 experimental conditions.

Repeated measures ANOVA doesn't take account of the downward temporal trend here.

One alternative is to use multivariable regression, where you regress the count variable on time, with the addition of the condition variable. If time is a continuous variable, that would be an ANCOVA model. To handle the repeated measures you can just include an indicator variable for the replication number. In R you would run:

model <- lm(pressure ~ time + replication + condition, data = mydata) 

One issue I see is that, initially, there appeared to be a fairly obvious linear trend, which would be captured by the estimate for the time variable (provided that it is coded numerically), but actually the time points are not equally spaced: they are in steps of 10 up to 60, but then in steps of 30. Now that you have updated the question with a different plot, this becomes more noticeable. One way to handle this is to code the time variable as discrete, since you are not interested in the time trend itself. This would seem a sensible first step. Depending on the model fit you might then want to explore a model with time as numeric, such as a piecewise linear model with 2 pieces (0-60 and 60-180), or a transformation to obtain a linear slope, such as a log transform.
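As a rough sketch of the three ways of coding time mentioned above, here is how the models might be set up. The data are simulated purely for illustration: the time points, effect sizes, decay shape and the level names `cond1`, `control`, `cond2`, `cond3` are all assumptions, not the asker's data.

```r
# Simulated stand-in data (NOT the real data): 4 conditions x 5 replications
# x 10 time points, in steps of 10 up to 60 and then in steps of 30.
set.seed(1)
mydata <- expand.grid(
  time        = c(10, 20, 30, 40, 50, 60, 90, 120, 150, 180),
  replication = factor(1:5),
  condition   = factor(c("cond1", "control", "cond2", "cond3"),
                       levels = c("cond1", "control", "cond2", "cond3"))
)
# Invented shifts relative to condition 1, plus a decaying time trend and noise.
cond_shift <- c(cond1 = 0, control = -30, cond2 = -10, cond3 = -20)
mydata$pressure <- 100 * exp(-mydata$time / 100) +
  unname(cond_shift[as.character(mydata$condition)]) +
  rnorm(nrow(mydata), sd = 3)

# 1. Time as a discrete factor: one coefficient per time point, no assumed shape.
fit_factor <- lm(pressure ~ factor(time) + replication + condition, data = mydata)

# 2. Time as numeric, piecewise linear with a single knot at 60.
fit_piecewise <- lm(pressure ~ time + pmax(time - 60, 0) + replication + condition,
                    data = mydata)

# 3. Time as numeric, linear in log(time).
fit_log <- lm(pressure ~ log(time) + replication + condition, data = mydata)

# Informal comparison of the three fits.
AIC(fit_factor, fit_piecewise, fit_log)

# Estimated differences from the reference level (cond1) under the factor coding.
coef(fit_factor)[c("conditioncontrol", "conditioncond2", "conditioncond3")]
```

Because `condition` is a factor with `cond1` as its first level, the three `condition` coefficients are differences from condition 1, as described in the text.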

The estimates for condition will answer your research question. If you code the condition variable as a factor with condition 1 as the reference level then you will get 3 estimates for condition (control, condition 2 and condition 3). However, the interpretation will be different depending on how you handle time. With time as a numeric variable each of the estimates for condition will be the expected difference between condition 1 and the other levels at time=0, so in this case you may want to centre the time variable around zero to make this more meaningful. If time is coded as a factor then the estimates for condition will be relative to whatever the reference level of time is. You can change the reference level to get different contrasts, though this can become unwieldy with so many time points. In order to look at the whole range of time, I believe you can use functions from the emmeans package, but this is not something I usually do so I can't comment much further than that.
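For what it's worth, a minimal sketch of the emmeans route might look like the following. It assumes the emmeans package is installed and again uses simulated data with invented effect sizes; the column name `time_f` (time stored as a factor) is hypothetical.

```r
library(emmeans)

# Simulated stand-in data (NOT the real data), with time stored as a factor.
set.seed(2)
mydata <- expand.grid(
  time_f      = factor(c(10, 20, 30, 40, 50, 60, 90, 120, 150, 180)),
  replication = factor(1:5),
  condition   = factor(c("cond1", "control", "cond2", "cond3"),
                       levels = c("cond1", "control", "cond2", "cond3"))
)
cond_shift <- c(cond1 = 0, control = -30, cond2 = -10, cond3 = -20)
time_num <- as.numeric(as.character(mydata$time_f))
mydata$pressure <- 100 * exp(-time_num / 100) +
  unname(cond_shift[as.character(mydata$condition)]) +
  rnorm(nrow(mydata), sd = 3)

fit <- lm(pressure ~ time_f + replication + condition, data = mydata)

# Estimated marginal mean for each condition, averaged over the levels of
# time_f and replication rather than tied to one reference time point.
emm <- emmeans(fit, ~ condition)
emm

# All pairwise differences between conditions, with Tukey adjustment.
cond_pairs <- pairs(emm, adjust = "tukey")
cond_pairs
```

With four conditions this gives six pairwise comparisons, which avoids re-fitting the model with a different reference level for each contrast.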

Another alternative is to use a mixed effects model, with random intercepts for replication. Arguably, 5 replications is too few to fit random intercepts, but I would still consider it. The random intercepts take care of the repeated measures. The base model formula would look something like this:

model <- lmer(pressure ~ time + condition + (1|replication), data =  mydata)

The considerations about the time and condition variables above would equally apply here.
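As an illustration only, the mixed model with time coded as a factor could be fitted and inspected along these lines. This assumes the lme4 package is installed; the data, including the replication-level shifts, are simulated with invented values.

```r
library(lme4)

# Simulated stand-in data (NOT the real data), including a made-up shift
# shared by all observations from the same replication.
set.seed(3)
mydata <- expand.grid(
  time        = c(10, 20, 30, 40, 50, 60, 90, 120, 150, 180),
  replication = factor(1:5),
  condition   = factor(c("cond1", "control", "cond2", "cond3"),
                       levels = c("cond1", "control", "cond2", "cond3"))
)
cond_shift <- c(cond1 = 0, control = -30, cond2 = -10, cond3 = -20)
rep_shift  <- rnorm(5, sd = 5)
mydata$pressure <- 100 * exp(-mydata$time / 100) +
  unname(cond_shift[as.character(mydata$condition)]) +
  rep_shift[as.integer(mydata$replication)] +
  rnorm(nrow(mydata), sd = 3)

# Random intercept per replication; time as a discrete factor.
fit_mixed <- lmer(pressure ~ factor(time) + condition + (1 | replication),
                  data = mydata)

# Fixed-effect estimates for condition, relative to the reference level cond1.
fixef(fit_mixed)[c("conditioncontrol", "conditioncond2", "conditioncond3")]

# Estimated between-replication and residual standard deviations.
VarCorr(fit_mixed)
```

The condition estimates are read in the same way as in the `lm` version; the replication effect now appears as a variance component instead of four indicator coefficients.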

lmer is a function from the lme4 package in R.

Robert Long
  • Thank you very much for your detailed reply! I will certainly attempt this. – user28999 Jun 27 '20 at 18:23
  • You're welcome. Good luck with it :) – Robert Long Jun 27 '20 at 18:27
  • Are the degrees of freedom fully adjusted when only random intercepts are considered ? I believe random slopes should also be taken into account to completely correct the model's degrees of freedom, i.e. to get correct p-values. Am I mistaken ? – Rodolphe Jun 27 '20 at 21:20
  • @Rodolphe perhaps I am not sure what you mean, but as far as I can understand your point there shouldn't be a problem. If you disagree, please post a new question about it. – Robert Long Jun 27 '20 at 21:37