Is there a way to design a mixed model for an uneven number of measurements per subject and (more importantly) uneven time intervals between measurements, which are taken at different time points (the dataset contains observations spanning several years)?
I have data about pig reproductive traits. The goal is to determine whether a particular mutation (SNP) affects the traits. Sample data:
| pig_id | measurement(order) | breed | year | snp | age(days) | y |
|---|---|---|---|---|---|---|
| 1 | 1 | A | 2020 | AA | 250 | 330 |
| 1 | 2 | A | 2020 | AA | 290 | 290 |
| ... | ... | ... | ... | ... | ... | ... |
| 1 | 80 | A | 2021 | AA | 600 | 320 |
| 2 | 1 | B | 2016 | BB | 330 | 400 |
| 2 | 2 | B | 2016 | BB | 350 | 385 |
| 2 | 3 | B | 2017 | BB | 365 | 360 |
The biggest problem I see is that measurement 1 for pig 1 is a completely different time point (date) than measurement 1 for pig 2.
Using SAS, I wanted to try something like this:
    proc mixed data=have;
      class pig_id breed year snp measurement;
      model y = age interval breed year snp measurement;
      repeated measurement / subject=pig_id(snp) type=SP(POW);
    run;
where `interval` would be the number of days since the previous measurement. But I am not sure whether that fixes the problem above.
I also considered giving every unique date of observation its own "serial number" (so instead of the measurement order I would use a time point from 1 to n), but then I would end up with thousands of levels for the fixed effect of time...
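For context, the spatial power structure `SP(POW)` is one that accommodates unequal spacing: it sets the correlation between two residuals from the same subject to ρ^d, where d is the distance between their time coordinates. In PROC MIXED the coordinate variable is normally supplied in a second set of parentheses, for example `type=SP(POW)(age)`. The sketch below only illustrates that correlation pattern for pig 1's ages from the sample table; the function name `sp_pow_corr` and the value of `rho` are hypothetical choices for illustration, not estimates from any data.

```python
# Spatial power correlation for one subject measured at uneven times.
# Assumption: rho is a made-up per-day correlation, not a fitted value.

def sp_pow_corr(times, rho):
    """Return the matrix whose (i, j) entry is rho ** |t_i - t_j|."""
    return [[rho ** abs(ti - tj) for tj in times] for ti in times]

ages_pig1 = [250, 290, 600]  # ages in days taken from the sample table (pig 1)
rho = 0.995                  # hypothetical value for illustration only

corr = sp_pow_corr(ages_pig1, rho)

# Print the matrix rounded to three decimals, one row per measurement.
for row in corr:
    print(["%.3f" % value for value in row])
```

Measurements 40 days apart end up more strongly correlated than measurements 350 days apart, without any need for the measurement order to line up across animals.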
So is there a solution within these mixed models, or is my only chance to dig into unevenly spaced time series?
`lme4` functions don't seem to provide as much flexibility for that as do other packages, but they seem to work well in practice. – EdM Oct 23 '22 at 16:43
`time = 0` reference makes the most sense, so that for each observation you would include the age at that observation as a time covariate, to model smoothly and flexibly. In case there are systematic differences among animals depending on the year in which they were born, you might include an additional covariate representing something related to the actual date of birth. That additional covariate would be the same value for all observations of an individual (like breed or SNP). If you don't expect differences related to date of birth, no need to do that. – EdM Oct 24 '22 at 16:53
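One way to write down the kind of model these comments point toward is sketched below. It is only an illustration under stated assumptions: the random intercept per pig $b_i$, the unspecified smooth function $f$ of age (for example a spline), and the optional birth-date term $\gamma c_i$ are notational choices made here, not something spelled out in the comments.

$$
y_{ij} = \beta_0 + \beta_{\text{snp}(i)} + \beta_{\text{breed}(i)} + f(\text{age}_{ij}) + \gamma\, c_i + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
$$

Here $i$ indexes pigs and $j$ indexes the measurements within a pig, so each observation enters with its own age and the number and spacing of measurements may differ freely between animals. The covariate $c_i$ is constant within a pig and can be dropped if no differences related to date of birth are expected.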