I am currently looking at this paper: https://www.nature.com/articles/s41591-021-01487-3. Equation (1) includes a variable for the baseline value of the outcome. I am confused about why they do this, since I thought it could cause problems by being correlated with the error term. Can anyone explain why?
It seems like the baseline value $y_{i0}$ in
$$ y_{it} = \beta_0 + \beta_1 \mathrm{High}_i + \beta_2 y_{i0} + \boldsymbol{X}_i \beta_3 + \varepsilon_{it} $$
plays a role similar to an offset in the regression model, though it's not clear to me why they made it a feature rather than an offset. It doesn't differ much from modeling $y_{it} - y_{i0}$ as the dependent variable: subtracting $y_{i0}$ from both sides only changes the coefficient on $y_{i0}$ from $\beta_2$ to $\beta_2 - 1$.
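As a rough numerical check of that point, here is a minimal sketch on simulated data. Everything in it (the sample size, the coefficient values, the single extra covariate `x`, the helper `ols`) is made up for illustration and has nothing to do with the paper's data. It fits the baseline-as-covariate model, the same model with the change score as the outcome, and the offset-style model where the coefficient on $y_{i0}$ is fixed to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated data; all values are invented for illustration only.
high = rng.integers(0, 2, size=n).astype(float)  # treatment indicator High_i
y0 = rng.normal(50.0, 10.0, size=n)              # baseline outcome y_{i0}
x = rng.normal(size=n)                           # one extra covariate X_i
y = 5.0 + 2.0 * high + 0.7 * y0 + 1.5 * x + rng.normal(0.0, 3.0, size=n)


def ols(design, response):
    """Return least-squares coefficients for response ~ design."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


ones = np.ones(n)
X_full = np.column_stack([ones, high, y0, x])

# (a) Baseline as a covariate: beta_2 is estimated freely.
b_cov = ols(X_full, y)

# (b) Change score as the outcome, baseline still a covariate.
b_chg = ols(X_full, y - y0)

# (c) Baseline as an offset: its coefficient is fixed to 1,
#     so y0 moves to the left-hand side and leaves the design matrix.
X_off = np.column_stack([ones, high, x])
b_off = ols(X_off, y - y0)

print("covariate model:         ", b_cov)
print("change score + covariate:", b_chg)
print("offset model:            ", b_off)
```

Fits (a) and (b) agree on every coefficient except the one on the baseline, which differs by exactly 1. Fit (c) coincides with them only if the true $\beta_2$ happens to be 1, which is the sense in which the covariate version is the more flexible one.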
Tim
- I followed the link about the offset in the regression model, but I still don't understand exactly what it means. – Lily Tee Apr 26 '22 at 22:23
- @LilyTee An offset modifies the dependent variable: multiplicatively in Poisson regression because of the log link, additively in linear regression. In the paper, however, they include it as an ordinary independent variable, where it serves as the "base level" at the start. Making it a feature rather than an offset is more flexible: an offset has its coefficient fixed to 1, whereas a feature's coefficient is estimated, so the base level is proportional to $y_{i0}$. – Tim Apr 27 '22 at 10:36
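To spell out the cases mentioned in that comment, here is a sketch in generic notation. The exposure $t_i$, the mean $\mu_i$ and the shorthand $\eta_i = \beta_0 + \beta_1 \mathrm{High}_i + \boldsymbol{X}_i \beta_3$ are symbols assumed here for illustration and are not taken from the paper; expectations are conditional on the regressors.

$$ \log \mu_i = \log t_i + \eta_i \;\Longleftrightarrow\; \mu_i = t_i \, e^{\eta_i} \qquad \text{(Poisson with log link: the offset acts multiplicatively)} $$

$$ \mathrm{E}[y_{it}] = y_{i0} + \eta_i \;\Longleftrightarrow\; \mathrm{E}[y_{it} - y_{i0}] = \eta_i \qquad \text{(linear model with offset: coefficient on } y_{i0} \text{ fixed to 1)} $$

$$ \mathrm{E}[y_{it}] = \beta_2 y_{i0} + \eta_i \qquad \text{(linear model with } y_{i0} \text{ as a feature: } \beta_2 \text{ is estimated)} $$

The second line is the third with $\beta_2 = 1$ imposed, so the offset version is a restricted special case of the model used in the paper.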