I'm a little fuzzy on the exact assumptions needed for mixed/fixed effects models. As an example, let's say we're trying to model the effect of age on a person's 5k time, and we have a dataset of race times by person, by year (so multiple observations per person at different ages).
From what I remember, the naive OLS way would be to regress race time on age. We don't want to do this because there are multiple observations per person, which violates the OLS assumption of independent observations. We can introduce a random effect for person to the model, which "allows" for each person to have their own intercept and lets us see the within-subject effect of age on race time. I believe this is a pretty standard way to deal with multiple observations per subject.
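For concreteness, the random-intercept model I have in mind can be sketched as below. The notation is my own assumption: $i$ indexes people, $j$ indexes a person's races, and $u_i$ is the person-specific shift in the intercept, treated as a draw from a common normal distribution.

$$
\text{time}_{ij} = \beta_0 + \beta_1\,\text{age}_{ij} + u_i + \varepsilon_{ij}, \qquad u_i \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
$$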
However, what's the difference between:
- Using a mixed-effects model as specified, and
- Using a fixed-effects model but using dummy variables for each person? In essence, why can't we just regress
`race time` on `age` + `Person A` + `Person B` + ..., where `Person [x]` is a dummy for a particular person in the data? Isn't this also effectively allowing each person to have their own intercept?
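To make the dummy-variable option concrete, here is a minimal sketch on simulated data. Every number and variable name in it (`baseline`, `true_slope`, the sample sizes) is made up for illustration and does not come from a real dataset. It fits the regression with one dummy per person and no global intercept, then checks that the age slope matches the one obtained by subtracting each person's means from age and race time first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated panel: 5 hypothetical people, 6 yearly races each (all values assumed).
n_people, n_years = 5, 6
person = np.repeat(np.arange(n_people), n_years)         # person index for each row
baseline = rng.normal(25.0, 3.0, size=n_people)          # per-person baseline time (minutes)
start_age = rng.integers(20, 50, size=n_people)          # age at first race
age = start_age[person] + np.tile(np.arange(n_years), n_people)
true_slope = 0.2                                         # assumed minutes slower per year of age
race_time = baseline[person] + true_slope * age + rng.normal(0.0, 0.5, size=person.size)

# Dummy-variable regression: one 0/1 column per person, no global intercept,
# so each dummy coefficient is that person's own intercept.
dummies = (person[:, None] == np.arange(n_people)).astype(float)
X = np.column_stack([age, dummies])
coef, *_ = np.linalg.lstsq(X, race_time, rcond=None)
lsdv_slope, person_intercepts = coef[0], coef[1:]

def demean(values):
    """Subtract each person's own mean from that person's observations."""
    means = np.bincount(person, weights=values) / np.bincount(person)
    return values - means[person]

# Within-person regression: slope of demeaned race time on demeaned age.
age_w, time_w = demean(age), demean(race_time)
within_slope = (age_w @ time_w) / (age_w @ age_w)

print("dummy-variable slope:", lsdv_slope)
print("within-person slope: ", within_slope)
print("per-person intercepts:", person_intercepts)
```

The two slopes agree up to floating-point error, so the dummy regression does give every person their own intercept and uses only within-person variation in age. The difference from the mixed model is that these intercepts are estimated as free parameters rather than treated as draws from a shared distribution.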