I tried to use a linear model to explain a variable "age" with two variables "x1" and "x2".
I can clearly see a decreasing trend in my scatterplots of age vs x1 and age vs x2, and the p-value of each coefficient in my model is < 0.001, but there is too much dispersion (the R² of my linear model is < 0.5).
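To illustrate the situation, here is a minimal sketch on synthetic data (not my real data): the coefficients, noise level, and sample size are made-up assumptions, chosen only to reproduce "clearly significant negative slopes, but low R²".

```python
import numpy as np

# Synthetic stand-in for my data: all numbers below are assumptions.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# age decreases linearly with x1 and x2, plus a lot of unexplained noise
age = 45.0 - 4.0 * x1 - 3.0 * x2 + rng.normal(scale=8.0, size=n)

# Ordinary least squares: age ~ 1 + x1 + x2
X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, age, rcond=None)

# R²: share of the variance of age explained by the model
resid = age - X @ beta
centered = age - age.mean()
r2 = 1.0 - (resid @ resid) / (centered @ centered)

# Standard errors and t-statistics of the coefficients
sigma2 = (resid @ resid) / (n - X.shape[1])
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
t_stats = beta / se

print("coefficients:", beta)
print("t-statistics:", t_stats)  # |t| above roughly 3.3 means p < 0.001 here
print("R²:", r2)
```

The slopes are estimated precisely because there are many observations, yet individual ages are still scattered widely around the fitted plane, which is what a low R² reflects.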
I don't have much information, so I would like to add "prior" information: the "exact distribution" of "age" in the population, taken from my country's census (which therefore does not depend on "x1" or "x2"). I hope this would help to narrow down the error. I'm not sure whether I can do that and, if so, how?
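To make the idea concrete, here is a rough sketch of what I have in mind, and I don't know whether it is valid. It swaps the direction of the regression: instead of age given x1 and x2, it assumes each predictor is Gaussian around a line in age, independently of the other given age, and then applies Bayes' rule with the census distribution as the prior. The function name `posterior_age`, the intercepts, slopes, and noise levels, and the placeholder "census" weights are all hypothetical, not fitted values or real census data.

```python
import numpy as np

def posterior_age(x_obs, ages, prior, intercepts, slopes, sigmas):
    """Posterior over a grid of ages for one observation (x1, x2).

    Assumes x_j | age ~ Normal(intercepts[j] + slopes[j] * age, sigmas[j]),
    with x1 and x2 conditionally independent given age.
    """
    x_obs = np.asarray(x_obs, dtype=float)
    # Expected value of each predictor at every candidate age: (n_ages, 2)
    mu = intercepts + np.outer(ages, slopes)
    # Gaussian log-likelihood summed over the predictors
    loglik = -0.5 * (((x_obs - mu) / sigmas) ** 2).sum(axis=1)
    # Bayes' rule on the grid: posterior is proportional to prior * likelihood
    logpost = np.log(prior) + loglik
    post = np.exp(logpost - logpost.max())  # subtract max for stability
    return post / post.sum()

ages = np.arange(0, 101)

# Placeholder prior (NOT real census data): younger ages more frequent
census_prior = np.exp(-ages / 60.0)
census_prior /= census_prior.sum()
flat_prior = np.full(ages.size, 1.0 / ages.size)

# Hypothetical x-given-age parameters with decreasing slopes
intercepts = np.array([10.0, 8.0])
slopes = np.array([-0.05, -0.03])
sigmas = np.array([1.5, 1.5])

x_obs = (7.5, 6.5)
post_census = posterior_age(x_obs, ages, census_prior, intercepts, slopes, sigmas)
post_flat = posterior_age(x_obs, ages, flat_prior, intercepts, slopes, sigmas)

print("most probable age, flat prior:  ", ages[post_flat.argmax()])
print("most probable age, census prior:", ages[post_census.argmax()])
print("posterior mean, census prior:   ", (ages * post_census).sum())
```

With noisy predictors the likelihood is wide, so the prior moves the estimate noticeably toward the ages that are more common in the population.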
I'm sure a linear model is suitable for describing the relationship between age and x1 and x2, but it cannot make precise predictions.
It can't predict age accurately because x1 and x2 capture little of the observed variation, even though age depends linearly on x1 and x2 and that relationship is well described.
– Knz Sep 23 '22 at 13:36