
We have the PISA data, which has a multilevel structure (student - school - level). The schools were selected randomly, so multilevel methods are the standard choice. For subsamples defined at level 2 (for example, schools from a given region), we can simply fit a multilevel model to the subsample. To study the effect of a level 1 variable such as gender, the usual approach is to fit a model with gender as a factor. Could we instead fit a multilevel model to a level 1 subsample? For example, we would fit one multilevel model to the female students only and another to the male students only. Is this a methodological error? Thanks, David

  • I'm not sure I follow. You are asking if you can use gender as a random effect? – Shawn Hemelstrand Jan 05 '24 at 13:15
  • I think OP means dividing the sample by gender and conducting separate multilevel analyses for men and women. If so, this is not a good idea. It's much better to run your analyses on the full sample and include gender as a predictor, plus gender*predictor interactions if those are relevant. – Sointu Jan 05 '24 at 14:01

1 Answer


I generally agree with @Sointu's comment. That said, one could make an argument for conducting a study that focused on girls in which you only looked at girls in the sample. You might be interested in understanding whether girls with differing motivational profiles have higher/lower achievement than other girls. In that case, the entire project is focused on a particular group and the theory and research questions revolve around within-group processes and concerns.

But unless we are misunderstanding your question, arbitrarily dividing the sample between boys and girls and running separate regressions on them is questionable. Some would call it a fishing expedition or perhaps wandering down a garden of forking paths. Why not divide it some other way? There needs to be strong theoretical justification when going down this route.

Erik Ruzek
  • Thank you for your answers. I am primarily interested in analysing the use of the variance partition coefficient (VPC) as a measure of equity in education, using maths scores as the response variable, so I need to calculate this parameter separately for both male and female students. The other option is to compare the VPC of the null model with the VPC of the model with the gender variable as a factor, but I find this more difficult to interpret. It is not an arbitrary choice of subsampling (or fish/fork), but a clearly defined research objective. – David Gutiérrez Rubio Jan 05 '24 at 17:02
  • This is helpful context. Have you looked at the work of Evans et al. (2019) and then Keller and colleagues (2023)? https://doi.org/10.1016/j.socscimed.2019.112499 https://doi.org/10.1007/s10648-023-09733-5 – Erik Ruzek Jan 05 '24 at 17:25
  • Thanks Erik, I will take a look at it! – David Gutiérrez Rubio Jan 07 '24 at 12:21
  • Sorry, I assumed this was about fixed effects. If the interest is in variance components I understand the idea of splitting the sample. – Sointu Jan 08 '24 at 08:54
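
To make the comments about the VPC concrete: in a two-level random-intercept null model, the VPC is the between-school variance divided by the total variance (between-school plus within-school). The sketch below estimates it separately for two level 1 subsamples. It is only an illustration under stated assumptions: the data are simulated, every number and name (`anova_vpc`, `simulate_subsample`, the standard deviations, the school counts) is hypothetical, and a simple one-way ANOVA (method of moments) estimator stands in for the ML/REML estimates that a fitted multilevel model would give. It also ignores PISA plausible values and sampling weights.

```python
import random
from statistics import fmean


def anova_vpc(scores_by_school):
    """Estimate the VPC from a one-way random-effects ANOVA decomposition.

    scores_by_school maps a school id to the list of student scores in it.
    This is a method-of-moments stand-in for a fitted random-intercept model.
    """
    groups = [g for g in scores_by_school.values() if g]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two schools and within-school replication")

    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [fmean(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective school size; equals the common size when schools are balanced.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)

    var_school = max((ms_between - ms_within) / n0, 0.0)  # truncate at zero
    var_student = ms_within
    total = var_school + var_student
    return var_school / total if total > 0 else 0.0


def simulate_subsample(rng, n_schools, n_students, sd_school, sd_student, mean=500.0):
    """Simulate scores for one subsample (hypothetical values, not PISA data)."""
    data = {}
    for school in range(n_schools):
        u = rng.gauss(0.0, sd_school)  # school-level random intercept
        data[school] = [mean + u + rng.gauss(0.0, sd_student)
                        for _ in range(n_students)]
    return data


rng = random.Random(2024)
# Assumed true VPCs: 400 / 6800 (about 0.06) for girls, 1600 / 8000 = 0.20 for boys.
girls = simulate_subsample(rng, 200, 20, sd_school=20.0, sd_student=80.0)
boys = simulate_subsample(rng, 200, 20, sd_school=40.0, sd_student=80.0)

vpc_girls = anova_vpc(girls)
vpc_boys = anova_vpc(boys)
print(f"VPC girls: {vpc_girls:.3f}")
print(f"VPC boys:  {vpc_boys:.3f}")
```

With real data one would replace the ANOVA estimator with the variance components from a multilevel model fitted to each subsample, and would want some way to quantify the uncertainty of each VPC before comparing them across groups.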