I have a disease rate calculated for people aged <50 (early-onset) and aged 50+ (late-onset), for every US county. The rate ratio is the early-onset rate divided by the late-onset rate for every county. Then I calculated the correlation coefficients (CC) between the rate ratio and 40 county-level risk factors (e.g., obesity prevalence). If the CC is positive, then I think the risk factor has a stronger association with the early-onset rate than with the late-onset rate.
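To make the setup concrete, here is a minimal sketch of the calculation described above. It uses synthetic data, so the array names, sizes, and value ranges are assumptions for illustration only, not my actual data.

```python
import numpy as np

def ratio_correlations(early_rate, late_rate, risk_factors):
    """Pearson correlation between the early/late rate ratio and each
    column of risk_factors (one column per risk factor)."""
    ratio = early_rate / late_rate
    ratio_centered = ratio - ratio.mean()
    factors_centered = risk_factors - risk_factors.mean(axis=0)
    covariance_sum = (factors_centered * ratio_centered[:, None]).sum(axis=0)
    scale = np.sqrt((factors_centered ** 2).sum(axis=0) * (ratio_centered ** 2).sum())
    return covariance_sum / scale

# Synthetic stand-ins (assumed values): 3000 counties, 40 risk factors.
rng = np.random.default_rng(0)
n_counties = 3000
risk_factors = rng.normal(size=(n_counties, 40))
early_rate = rng.uniform(1.0, 5.0, size=n_counties)
late_rate = rng.uniform(20.0, 60.0, size=n_counties)

cc = ratio_correlations(early_rate, late_rate, risk_factors)
print(cc.shape)  # one correlation coefficient per risk factor
```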
However, I found that all 40 risk factors were positively correlated with the rate ratio. This seems strange, because I would expect at least some risk factors to have a stronger association with the late-onset rate.
Then I found that the range and variance of the early-onset rate are much greater than those of the late-onset rate. This is because far fewer people have early-onset disease than late-onset disease, so the early-onset rates are less stable owing to the small number of cases. Thus, in the rate ratio, the numerator has a much higher variance than the denominator. I think that is why all risk factors are positively associated with the rate ratio. In the extreme case where the denominator has no variance, the correlation between the rate ratio and a risk factor would be the same as the correlation between the early-onset rate and that risk factor.
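The extreme case can be checked with a small simulation. This is only a sketch on made-up numbers (the risk factor `x`, the coefficients, and the noise levels are all assumptions). It illustrates the limiting case only and does not show that this mechanism explains my results.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x = rng.normal(size=n)                                   # hypothetical risk factor
early = 20.0 + 2.0 * x + rng.normal(scale=3.0, size=n)   # noisy early-onset rate

# Denominator with no variance: the ratio is just a rescaled early rate.
late_constant = np.full(n, 50.0)
r_early = np.corrcoef(early, x)[0, 1]
r_ratio_constant = np.corrcoef(early / late_constant, x)[0, 1]

# Denominator with very little variance relative to the numerator.
late_narrow = 50.0 + 0.5 * rng.normal(size=n)
r_ratio_narrow = np.corrcoef(early / late_narrow, x)[0, 1]

print(r_early, r_ratio_constant, r_ratio_narrow)
```

With a constant denominator the two correlations coincide, because Pearson correlation is unchanged when one variable is multiplied by a positive constant. With a nearly constant denominator they should be close.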
I tried standardizing the early-onset disease rates to the range of the late-onset disease rates (the same method as in this post). After that, the two rates had the same range and a similar variance, and the CCs between the rate ratio and the risk factors showed both positive and negative associations. However, I am not sure whether this is a valid way to determine the direction of the associations. If it is, are there any published papers that support it?
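For reference, this is roughly what I did, assuming the method is a linear min-max rescaling onto the target range. The function name `rescale_to_range` and the toy values are hypothetical, chosen only to show the arithmetic.

```python
import numpy as np

def rescale_to_range(values, target):
    """Linearly map values onto [target.min(), target.max()]."""
    v_min, v_max = values.min(), values.max()
    t_min, t_max = target.min(), target.max()
    return t_min + (values - v_min) * (t_max - t_min) / (v_max - v_min)

# Toy rates for four counties (assumed values).
early = np.array([0.0, 2.0, 10.0, 4.0])
late = np.array([30.0, 35.0, 40.0, 50.0])

early_rescaled = rescale_to_range(early, late)
ratio_after_rescaling = early_rescaled / late
print(early_rescaled)
```

Because the rescaling is linear with a positive slope, the correlation between the early-onset rate itself and any risk factor is unchanged. The ratio is not a linear function of the original ratio once an offset is added to the numerator, which is part of why I am unsure how to interpret the resulting CCs.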