We have a complicated experimental design that we would like to perform LRT analysis for. Our main goal is to discover significant genes for the "Injection:Social" interaction term across the entire dataset by removing it from the LRT reduced model, and as a bonus we are also interested in discovering significant genes for that interaction term for each respective brain region.
Sample Injection Social Region Individual ind.n
HY06 L ISO HY S06 S1
NST6 L ISO NS S06 S1
TN06 L ISO TN S06 S1
HY08 L ISO HY S08 S2
NST8 L ISO NS S08 S2
TN08 L ISO TN S08 S2
HY30 L KF HY S30 S1
NST30 L KF NS S30 S1
TN30 L KF TN S30 S1
HY32 L KF HY S32 S2
NST32 L KF NS S32 S2
TN32 L KF TN S32 S2
HY64 L KFC HY S64 S1
NST64 L KFC NS S64 S1
TN64 L KFC TN S64 S1
HY65 L KFC HY S65 S2
NST65 L KFC NS S65 S2
TN65 L KFC TN S65 S2
HY19 L NF HY S19 S1
NST19 L NF NS S19 S1
TN19 L NF TN S19 S1
HY24 L NF HY S24 S2
NST24 L NF NS S24 S2
TN24 L NF TN S24 S2
HY05 S ISO HY S05 S1
NST5 S ISO NS S05 S1
TN05 S ISO TN S05 S1
HY12 S ISO HY S12 S2
NST12 S ISO NS S12 S2
TN12 S ISO TN S12 S2
HY31 S KF HY S31 S1
NST31 S KF NS S31 S1
TN31 S KF TN S31 S1
HY34 S KF HY S34 S2
NST34 S KF NS S34 S2
TN34 S KF TN S34 S2
HY62 S KFC HY S62 S1
NST62 S KFC NS S62 S1
TN62 S KFC TN S62 S1
HY63 S KFC HY S63 S2
NST63 S KFC NS S63 S2
TN63 S KFC TN S63 S2
HY04 S NF HY S04 S1
NST4 S NF NS S04 S1
TN04 S NF TN S04 S1
HY20 S NF HY S20 S2
NST20 S NF NS S20 S2
TN20 S NF TN S20 S2
My first attempt was building simple full (m1) and reduced (m2) models that gets directly at our question of interest but doesn't control for nested individuals.
m1 <- model.matrix(~ Region + Social * Injection, colData_filt)
m2 <- model.matrix(~ Region + Social + Injection, colData_filt)
We want to control for individual/batch effects, which is nested within both "Injection" and "Social" but not region, as we have three brain regions per individual. I followed the example in the DESeq2 manual for creating a term (ind.n) distiguishing individuals nested within groups, but now I'm not sure how to create the full and reduced model given that I have one more level than the example.
I've tried a really elaborate full model (m1) with the interaction term of interest (Injection:Social) removed for the reduced model (m2), but I'm not sure this is correct based on our design.
m1 <- model.matrix(~ Injection + Injection:ind.n + Injection:Social + Injection:Region + Social + Social:ind.n + Social:Region + Region, colData_filt)
m2 <- model.matrix(~ Injection + Injection:ind.n + Injection:Region + Social + Social:ind.n + Social:Region + Region, colData_filt)
I'm assuming this is wrong, but even if this was by some miracle the correct formulation, would there be a way to extract genes that explain the "Injection:Social" interaction term for separate brain regions?
As a work-around, I subsetted the data by region and ran three separate LRT analyses for each subset and compared the results. While this simplified the model to look like the first example above, I worry that we lose some power by ignoring the fact we have multiple brain region samples from single individuals across the dataset.
Any guidance is much appreciated. Thanks in advance
"Injection:Social:Region"only if you includeRegion:Social. My point is this, do you think there is such an effect? – StupidWolf Jun 23 '20 at 18:41