Pardon my ignorance here. When analyzing data pooled from multiple cohorts (7 in my case), if one cohort contributes a very small number of subjects relative to the others, is it worth including that cohort in the analysis? What are the disadvantages of including it?
This is how many children I have from each cohort; C4 is the one in question.
C1   C2   C3    C4  C5    C6   C7
200  350  1654  17  1101  412  331
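To put the imbalance in numbers, here is a minimal Python sketch that computes each cohort's share of the pooled sample. The 1/sqrt(n) column is only a rough precision indicator and assumes a similar within-cohort spread in every cohort, which nothing above confirms.

```python
# Cohort sizes copied from the table above (unique children per cohort).
cohort_sizes = {"C1": 200, "C2": 350, "C3": 1654, "C4": 17,
                "C5": 1101, "C6": 412, "C7": 331}

total = sum(cohort_sizes.values())

# Share of the pooled sample contributed by each cohort.
shares = {c: n / total for c, n in cohort_sizes.items()}

# Rough precision indicator: the standard error of a cohort-specific mean
# scales like 1/sqrt(n) IF the within-cohort spread is similar across
# cohorts (an assumption, not something the data above confirm).
rel_se = {c: n ** -0.5 for c, n in cohort_sizes.items()}

for c, n in cohort_sizes.items():
    print(f"{c}: n={n:5d}  share={shares[c]:6.2%}  1/sqrt(n)={rel_se[c]:.3f}")
```

C4 comes to 17 of 4065 children, roughly 0.4% of the pooled sample.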
Study objective: the impact of exposure (A) on the child's mental development (Y).
The outcome, the child's mental development, is evaluated with Bayley's Mental Development Index (MD124). This is a continuous variable.
The exposures are mercury, manganese, and cadmium. Each is time-varying, measured at baseline and at two follow-up visits.
For each outcome measurement (Y), I have three exposure measurements at time 1, three at time 2, and three at time 3. So Cohort 1 contributes data on 200 unique children, Cohort 2 on 350 unique children, and so on.
SubjectID  CohortID  Y   Exposure1  Exposure2  Exposure3  Time
1          C1        51  12.2       10.5       11.7       Baseline
1          C1        53  12.5       10.4       11.5       Followup1
1          C1        54  12.6       10.2       11.6       Followup2
2          C1        51  12.1       10.1       11.7       Baseline
2          C1        53  12.2       10.2       11.1       Followup1
2          C1        54  12.4       10.3       11.2       Followup2
.          .         .   .          .          .          .
.          .         .   .          .          .          .
.          .         .   .          .          .          .
.          .         .   .          .          .          .
1          C7        51  11.2       12.5       11.7       Baseline
1          C7        53  11.5       11.4       11.5       Followup1
1          C7        54  10.6       9.2        11.6       Followup2
2          C7        51  11.1       12.1       11.7       Baseline
2          C7        53  12.2       12.2       11.1       Followup1
2          C7        54  9.4        9.3        11.2       Followup2
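One practical point about this layout: SubjectID appears to restart at 1 within each cohort, so it does not identify a child on its own. A minimal sketch of building a composite key follows; the name child_keys and the "cohort_subject" format are hypothetical, chosen only for illustration.

```python
# Rows copied from the example layout above: (SubjectID, CohortID, Time).
rows = [
    (1, "C1", "Baseline"), (1, "C1", "Followup1"), (1, "C1", "Followup2"),
    (2, "C1", "Baseline"), (2, "C1", "Followup1"), (2, "C1", "Followup2"),
    (1, "C7", "Baseline"), (1, "C7", "Followup1"), (1, "C7", "Followup2"),
    (2, "C7", "Baseline"), (2, "C7", "Followup1"), (2, "C7", "Followup2"),
]

# SubjectID alone is ambiguous because the same values recur across cohorts.
ids_only = {subject for subject, _, _ in rows}

# A composite key (hypothetical format) identifies each child uniquely.
child_keys = {f"{cohort}_{subject}" for subject, cohort, _ in rows}

print(len(ids_only))    # distinct SubjectID values in these rows
print(len(child_keys))  # distinct children in these rows
```

Any grouping of repeated measures by child would need the composite key rather than SubjectID alone.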
I am planning to include cohort ID in the model to estimate the cohort effect.
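To make "including cohort ID" concrete, here is a minimal sketch assuming it enters as a categorical fixed effect (it could instead be modelled as a random effect). Statistical software typically expands such a term into one 0/1 indicator per non-reference cohort. The choice of C1 as reference and the function name cohort_indicators are illustrative assumptions.

```python
COHORTS = ["C1", "C2", "C3", "C4", "C5", "C6", "C7"]
REFERENCE = "C1"  # assumption: any cohort could serve as the reference level


def cohort_indicators(cohort_id):
    """Return 0/1 indicators for every cohort except the reference."""
    if cohort_id not in COHORTS:
        raise ValueError(f"unknown cohort: {cohort_id}")
    return [1 if cohort_id == c else 0 for c in COHORTS if c != REFERENCE]


# Indicator order: C2, C3, C4, C5, C6, C7.
print(cohort_indicators("C1"))  # reference cohort: all zeros
print(cohort_indicators("C4"))  # only the C4 indicator is set
```

Under this coding, the C4 indicator is nonzero only for rows from the 17 children in that cohort, so the C4 coefficient rests entirely on them.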