
Suppose that we have a three-wave dataset.

We have some qualitative states for each wave, let's call them State 1 and State 2.

Let's construct a change variable Change:

  1. Change is all 0s at Wave 1 (baseline),
  2. Change is 1 or 0 in Wave 2 (if Wave1state != Wave2state, 1, otherwise 0),
  3. Change is 1 or 0 in Wave 3 (if Wave2state != Wave3state, 1, otherwise 0).
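As a minimal sketch of the three rules above, the variable could be built in base R as follows. The column name `state` and the state values are assumptions made for illustration; only `id`, `t`, and `change` appear in the question.

```r
# Hypothetical long-format data: one row per unit (id) and wave (t).
# The `state` column and its values are assumed, not taken from the question.
d <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  t     = c(1, 2, 3, 1, 2, 3),
  state = c("State 1", "State 2", "State 2", "State 1", "State 2", "State 1")
)

# Make sure each unit's waves are in chronological order
d <- d[order(d$id, d$t), ]

# Previous wave's state within each unit (NA at Wave 1)
d$prev_state <- ave(d$state, d$id, FUN = function(s) c(NA, head(s, -1)))

# 1 if the state differs from the previous wave, 0 otherwise;
# Wave 1 has no previous wave and is set to 0 by construction
d$change <- ifelse(is.na(d$prev_state), 0L, as.integer(d$state != d$prev_state))

d[, c("id", "t", "change")]
```

With these assumed states, the resulting `change` column is 0, 1, 0 for unit 1 and 0, 1, 1 for unit 2.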

Here is an example dataset:

id t change
1 1 0
1 2 1
1 3 0
2 1 0
2 2 1
2 3 1

And finally, let's fit a within-between model to model this change:

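library(panelr)
# wbm() expects a panel_data object; id and t identify the unit and the wave
d <- panel_data(d, id = id, wave = t)
wbm(change ~ IV1 + IV2 | IV3 | (1 | id), data = d, family = "binomial")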

My questions are as follows:

  • Is it OK to use Wave 1 with all 0s in the dependent variable for this specification?
  • How should I interpret the coefficients, given that the approach demeans the variables and there are all 0s in my dependent variable for the first wave?
  • If this approach produces some sort of bias, what should I do instead?
Comment: For wave 1, is everyone in the same qualitative state or can it be either state? Should we truly assume there are exactly two qualitative states or is that for simplicity's sake here? – dcoy Jul 14 '22 at 18:54

1 Answer


(I might have to change this if your answers to my comment-question surprise me.)

The model, as you describe it, estimates each unit's propensity to change from its previous state, regardless of what that state was. With the within-between decomposition, the within-unit coefficients capture how a unit's propensity to change shifts when its IV1 and IV2 change, controlling for the unit's overall propensity to change during the panel and for its average levels of IV1 and IV2. This is pretty confusing, and your effects will be totally agnostic toward the qualitative features of each state: the model only reports the (within) change in propensity to change and the (between) panel-average propensity to change. Assuming this really is what you want...

Re: "Is it OK to use Wave 1 with all 0s in the dependent variable for this specification?"

Assuming the 0 at Wave 1 has nothing to do with which qualitative state units are in, I think you would not want to include this wave. The model will not interpret it as a "baseline" but as equivalent to state persistence from a pre-panel wave into Wave 1. It will inform the coefficients in the same way that persistence from Wave 1 to Wave 2 or from Wave 2 to Wave 3 would. The within-unit coefficients are equivalent to including unit dummies in the model or to demeaning from the unit mean over the panel. Having that first wave will bias the estimates by effectively adding a zero to the calculation of the unit mean from which Waves 2 and 3 deviate.
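To make the "adding a zero to the unit mean" point concrete, here is a small sketch using the six rows of the example dataset from the question. It only illustrates the arithmetic of the outcome's unit means; it does not fit any model.

```r
# Example dataset from the question
d <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2),
  t      = c(1, 2, 3, 1, 2, 3),
  change = c(0, 1, 0, 0, 1, 1)
)

# Unit means of the outcome when all three waves are used
mean_all <- tapply(d$change, d$id, mean)

# Unit means when the structural-zero Wave 1 is dropped
d23 <- d[d$t > 1, ]
mean_23 <- tapply(d23$change, d23$id, mean)

mean_all  # unit 1: 1/3, unit 2: 2/3
mean_23   # unit 1: 1/2, unit 2: 1
```

Note that unit 2 changes in both observed transitions, so with Waves 2 and 3 only it shows no within-unit variation in the outcome; the Wave 1 zero is what makes it look as if it varied.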

More generally, there is evidence that generalized linear models (like this one) have some inherent bias in estimation. It's often quite small, though. You could run the model on Waves 2 and 3 only and compare your within-unit coefficients to those of a traditional fixed-effects logistic regression panel model: see examples of packages.
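One possible way to set up the fixed-effects side of that comparison is sketched below. It assumes `survival::clogit` (conditional logit) as the "traditional" fixed-effects estimator, which is my choice and is not named in the answer. The data are simulated placeholders, so the coefficient itself means nothing.

```r
library(survival)  # recommended package, shipped with standard R installations

set.seed(1)

# Simulated placeholder data (NOT the asker's data): 200 units, 3 waves,
# one time-varying predictor IV1
n <- 200
d <- data.frame(id = rep(seq_len(n), each = 3), t = rep(1:3, times = n))
d$IV1 <- rnorm(nrow(d))
d$change <- rbinom(nrow(d), 1, plogis(0.8 * d$IV1))
d$change[d$t == 1] <- 0  # structural zero at baseline, as in the question

# Drop Wave 1 so that only observed transitions enter the model
d23 <- d[d$t > 1, ]

# Conditional (fixed-effects) logit: strata(id) conditions out each unit's
# intercept; units whose outcome never varies contribute no information
fe <- clogit(change ~ IV1 + strata(id), data = d23)
coef(fe)
```

The `IV1` coefficient from this fit would then be set against the within-unit `IV1` coefficient from `wbm()` fitted to the same two waves.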

dcoy