
I have a dataset with (microarray) gene expression data that was sampled from the same individuals at multiple timepoints. Our exposure is a continuous variable, and because this was an observational study there is no consistent pattern of change in exposure over time (e.g. one person's exposure may continuously increase, another's may go up and down, another's may not change at all, etc.). The exposure is not expected to have any cumulative effect, so different timepoints within a person do not influence each other. Our goal is to find genes whose expression is associated with the level of exposure.

We are thinking of analyzing this as a hierarchical model, with timepoints nested within person (so a mixed model with a random effect for person). We will also need to control for a few covariates, some of which will be the same at all timepoints (race, sex) and some of which may change between timepoints (use of particular medications).

Is it possible to use limma for this? I found that limma has a function called duplicateCorrelation(), intended for block designs, that some people have used to fit mixed models, but I am not sure whether it can be used here. If not limma, what would you suggest?

llrs
bluemouse

1 Answer


Yes, you can use limma for this mixed-model approach. As you suggest, the random effect (person) can be handled with duplicateCorrelation(), by passing person as the block argument.

Here is a similar example with RNA-seq data, on the Bioconductor support site.
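As a rough illustration of the approach, here is a minimal sketch on simulated data. Everything about the data is an assumption made for the example: the `targets` table, its column names (`person`, `exposure`, `sex`, `medication`), the sample sizes, and the simulated effect sizes are all hypothetical stand-ins for the real study. Race is left out for brevity and would be added to the formula in the same way as sex. The limma calls themselves (`duplicateCorrelation()`, `lmFit()`, `eBayes()`, `topTable()`) are the standard ones.

```r
library(limma)  # duplicateCorrelation() also relies on the statmod package

set.seed(1)

# --- Simulated stand-in data (hypothetical; replace with your own) ---
n_person <- 10   # individuals
n_time   <- 4    # timepoints per individual
n_gene   <- 200  # genes on the array
n_sample <- n_person * n_time

targets <- data.frame(
  person     = factor(rep(seq_len(n_person), each = n_time)),
  exposure   = rnorm(n_sample),  # continuous, varies within person
  # sex is constant across timepoints within a person
  sex        = factor(rep(rep(c("F", "M"), length.out = n_person), each = n_time)),
  # medication use may change between timepoints
  medication = factor(sample(c("no", "yes"), n_sample, replace = TRUE))
)

# Log-expression = person-level random intercept + noise
person_effect <- matrix(rnorm(n_gene * n_person, sd = 0.7), n_gene, n_person)
expr <- person_effect[, as.integer(targets$person)] +
  matrix(rnorm(n_gene * n_sample, sd = 0.5), n_gene, n_sample)
rownames(expr) <- paste0("gene", seq_len(n_gene))

# Make the first 10 genes truly associated with exposure
expr[1:10, ] <- expr[1:10, ] + outer(rep(1, 10), targets$exposure)

# --- Analysis ---
# Fixed effects only: exposure plus covariates. Person is NOT in the design.
design <- model.matrix(~ exposure + sex + medication, data = targets)

# Estimate one consensus within-person correlation shared by all genes
corfit <- duplicateCorrelation(expr, design, block = targets$person)

# Fit the model, treating person as a random effect via block/correlation
fit <- lmFit(expr, design,
             block = targets$person,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)

# Genes ranked by evidence for association with exposure
top <- topTable(fit, coef = "exposure", number = 10)
print(corfit$consensus.correlation)
print(top)
```

Note that limma estimates a single within-person correlation that is shared across all genes, rather than a separate variance component per gene, so this is an approximation to fitting a full mixed model gene by gene.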

benn
  • From the help page of duplicateCorrelation: "Estimate the correlation between duplicate spots (regularly spaced replicate spots on the same array) or between technical replicates from a series of arrays." So I wouldn't advise using this function to account for this effect. – llrs May 02 '18 at 10:01
  • @Llopis, did you also read Gordon Smyth's answer in the Bioconductor link? Gordon Smyth is the creator of limma and a professor of gene expression statistics; he knows what he is talking about. – benn May 02 '18 at 10:08
  • @Llopis It's not explicitly stated as such, but it's another use for that function (see, for example https://support.bioconductor.org/p/59700/ ). – Devon Ryan May 02 '18 at 10:08
  • @Devon, I know, but it makes it harder to understand what is going on. From my POV it would be easier to have it in the design matrix. b.nota, I did, and I found that he also says: "You can't test for factor A. It doesn't really make sense to test for a random effect." Here A was the random effect in that question. – llrs May 02 '18 at 10:12
  • @Llopis, putting everything in a design matrix won't make it a mixed model but a linear model with only fixed effects. In the question in the link, the question was to get the effect of A, B, and C, and Gordon explains that it doesn't make sense to get the effect of A (random effect). – benn May 02 '18 at 10:22
  • Ok, then I need to review my understanding of mixed models, because I thought that adding them to the design matrix was a good way to handle them :\ – llrs May 02 '18 at 10:49
  • Don't worry @Llopis, most of the time you will only have to deal with fixed effects. In OP's example here, they observed that the effect is random (up, down, or no change); that's when you have to consider using mixed models. – benn May 02 '18 at 14:25