I am currently looking at this paper: https://www.nature.com/articles/s41591-021-01487-3. Equation (1) includes the baseline value of the outcome as a regressor. I am confused about why they do this, since I thought it could cause problems by being correlated with the error term. Can anyone explain why it is included?

jbowman
  • 38,614

1 Answer

It seems that the baseline value $y_{i0}$ in

$$ y_{it} = \beta_0 + \beta_1 \mathrm{High}_i + \beta_2 y_{i0} + \boldsymbol{X}_i \beta_3 + \varepsilon_{it} $$

plays a role similar to an offset in a regression model, though it's not clear to me why they made it a feature rather than an offset. If $\beta_2$ were fixed at 1, this would be the same as modeling $y_{it} - y_{i0}$ as the dependent variable, so the two approaches do not differ much.
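To make the comparison concrete, here is a minimal sketch on simulated data (not the paper's data). Everything in it is an assumption for illustration: the sample size, the true coefficients (with $\beta_2 = 0.6$), a single covariate standing in for $\boldsymbol{X}_i$, one follow-up measurement per subject, and the hypothetical helper `ols`. It fits the model with $y_{i0}$ as a feature, the change-score model that treats $y_{i0}$ as an offset, and the change-score model that also adjusts for $y_{i0}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # assumed sample size, illustration only

# Simulated stand-ins for the paper's variables (all values are assumptions).
high = rng.integers(0, 2, size=n).astype(float)  # High_i indicator
x = rng.normal(size=n)                           # one covariate standing in for X_i
y0 = rng.normal(loc=10.0, scale=2.0, size=n)     # baseline y_{i0}
y = 1.0 + 0.5 * high + 0.6 * y0 + 0.3 * x + rng.normal(size=n)  # follow-up y_{it}


def ols(design, response):
    """Ordinary least squares coefficients via numpy's least-squares solver."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


ones = np.ones(n)
X_cov = np.column_stack([ones, high, y0, x])  # baseline as a feature
X_off = np.column_stack([ones, high, x])      # baseline left out of the design

beta_cov = ols(X_cov, y)       # beta_2 is estimated from the data
beta_off = ols(X_off, y - y0)  # offset / change score: beta_2 fixed at 1
beta_chg = ols(X_cov, y - y0)  # change score, still adjusting for baseline

print("baseline as feature:       ", beta_cov)
print("baseline as offset:        ", beta_off)
print("change score + baseline:   ", beta_chg)
```

In the third fit the coefficient on `high` matches the first fit exactly, and the coefficient on `y0` is simply shifted by 1. The offset version only agrees with the feature version when the true $\beta_2$ is close to 1, which is not the case in this simulation.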

Tim
  • 138,066
  • I followed the link about the offset in the regression model, but I still don't understand what exactly it means – Lily Tee Apr 26 '22 at 22:23
  • @LilyTee An offset modifies the dependent variable: multiplicatively in Poisson regression (because of the log link), additively in linear regression. In the paper, however, they include it as an ordinary independent variable, where it serves as the "base level" at the start. Making it a feature rather than an offset is more flexible: an offset has its coefficient fixed to 1, whereas a feature's coefficient is estimated, so the base-level contribution is proportional to $y_{i0}$. – Tim Apr 27 '22 at 10:36