
I am investigating the relationship between scores on 3 questionnaires (SPQ, CAPS, PDI) and the effect of an experimental 'Condition' on performance (Correct/Incorrect). I have run the following logistic mixed-effects models in lme4 (random effects omitted here for readability):

  1. Correct ~ PE_Condition*spq

  2. Correct ~ PE_Condition*(spq+caps+pdi)

I then ran a third model (model 2 with additional covariates), but it did not converge. There is some evidence of multicollinearity between predictors, which I have tried to handle by centering all the predictors.

Currently the SPQ variable is a summary score for the questionnaire. However, this questionnaire can also be scored to give 3 separate subscale scores. I want to investigate these 3 subscales.

QUESTION: I'm not sure which approach is best to do this:

  1. Separately run a) Correct ~ PE_Condition*(spq1+caps+pdi), b) Correct ~ PE_Condition*(spq2+caps+pdi), c) Correct ~ PE_Condition*(spq3+caps+pdi)

  2. Correct ~ PE_Condition*(spq1+spq2+spq3+caps+pdi)

SilvaC
  • What is the difference between models 2 and 3? They seem identical. Can you tell us more about the study design? spq, caps and pdi are separate questionnaires, right? Where do you see the collinearities? So far I would agree with Peter about retaining the full dataset – Robert Long Dec 14 '23 at 19:21
  • Apologies; I originally meant to illustrate model 3 as having additional covariates. This is now corrected above. – SilvaC Dec 15 '23 at 10:47

1 Answer


First, unless there are interaction terms or quadratics or cubics etc., centering cannot get rid of collinearity. See this thread and, in particular, @FrankHarrell's answer. I'm not sure what happened in your case. (Unless the collinearity is with the intercept, in which case see @CoffeeJunky's answer, but you say it is "between predictors".)
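To make the point about centering concrete, here is a minimal sketch on simulated data. Everything in it is hypothetical: the variable names `spq`, `caps`, and `cond` only echo the question, and the means, spreads, and correlations are invented for illustration. It checks two things: centering leaves the correlation between two predictors unchanged, and it can sharply reduce the correlation between a main effect and its own interaction term.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def center(a):
    """Subtract the sample mean from every value."""
    m = sum(a) / len(a)
    return [x - m for x in a]


random.seed(1)
n = 2000

# Hypothetical data: two questionnaire scores that share a latent factor,
# and a randomly assigned binary condition (0/1).
latent = [random.gauss(0, 1) for _ in range(n)]
spq = [50 + 10 * (0.8 * z + 0.6 * random.gauss(0, 1)) for z in latent]
caps = [30 + 5 * (0.8 * z + 0.6 * random.gauss(0, 1)) for z in latent]
cond = [random.randint(0, 1) for _ in range(n)]

spq_c, caps_c, cond_c = center(spq), center(caps), center(cond)

# 1) Correlation between the two scores, before and after centering.
r_raw = pearson(spq, caps)
r_centered = pearson(spq_c, caps_c)

# 2) Correlation between the condition main effect and the
#    condition-by-spq product term, before and after centering.
r_int_raw = pearson(cond, [c * s for c, s in zip(cond, spq)])
r_int_centered = pearson(cond_c, [c * s for c, s in zip(cond_c, spq_c)])

print(f"spq vs caps:         raw={r_raw:.3f}  centered={r_centered:.3f}")
print(f"cond vs cond*spq:    raw={r_int_raw:.3f}  centered={r_int_centered:.3f}")
```

The first pair of numbers should be identical, because a correlation does not depend on where the variables are located. Only the second pair changes, which is the sense in which centering helps when interaction or polynomial terms are present.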

Second, in general, I think it's better to run one equation. You can deal with collinearity by ridge regression. Ridge regression for multilevel models has also been discussed here a little, e.g. in this thread, but there doesn't seem to be a huge literature on this. Other ways of dealing with collinearity tend to make it hard or impossible to include all the variables, at least as separate variables, which you say is a goal.
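As a rough illustration of what the ridge penalty does with collinear predictors, here is a small sketch. It is deliberately not a multilevel or logistic model: it uses ordinary linear ridge regression with two simulated predictors, solved in closed form, purely to show the shrinkage. The data, the helper `ridge_two_predictors`, and the penalty value of 50 are all assumptions made up for this example.

```python
import random
from math import hypot


def center(a):
    """Subtract the sample mean from every value."""
    m = sum(a) / len(a)
    return [x - m for x in a]


def ridge_two_predictors(x1, x2, y, lam):
    """Solve (X'X + lam*I) b = X'y for two centered predictors (no intercept).

    lam = 0 gives ordinary least squares.
    """
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    # Inverse of a symmetric 2x2 matrix applied to (t1, t2).
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2


random.seed(2)
n = 200

# Hypothetical, strongly collinear predictors: both are a shared latent
# score plus a little independent noise.
latent = [random.gauss(0, 1) for _ in range(n)]
x1 = center([z + 0.1 * random.gauss(0, 1) for z in latent])
x2 = center([z + 0.1 * random.gauss(0, 1) for z in latent])
y = center([a + b + random.gauss(0, 1) for a, b in zip(x1, x2)])

ols = ridge_two_predictors(x1, x2, y, lam=0.0)
ridge = ridge_two_predictors(x1, x2, y, lam=50.0)  # penalty chosen arbitrarily

print("OLS coefficients:  ", ols)
print("Ridge coefficients:", ridge)
print("Coefficient norm, OLS vs ridge:", hypot(*ols), hypot(*ridge))
```

With predictors this collinear, the two unpenalized coefficients are poorly determined individually even though their sum is estimated well. The penalty pulls the coefficient vector toward zero, so its norm is smaller than in the unpenalized fit. In practice the penalty would be chosen by cross-validation rather than fixed by hand.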

Peter Flom
  • Thank you! Is there a particular reason that you recommend one model? Second, I centered the predictors because I read a few papers which suggested this, particularly for models with interactions between binary and continuous predictors. I do indeed have an interaction between each score and PE_Condition in the equation (e.g., PE_Condition*spq). – SilvaC Dec 14 '23 at 15:16
  • One model because it lets you control for the other variables. There shouldn't be a correlation between condition and the spq scores. If condition is, as you say, an "experimental condition" then people should be randomly assigned to the different conditions. – Peter Flom Dec 14 '23 at 15:26
  • Great, thank you so much. Apologies, I meant the interaction PE_Condition*(spq+pdi+caps). This is when I have problems with (multi)collinearity as the questionnaire scores are correlated. – SilvaC Dec 14 '23 at 15:31
  • Oh. I see. That makes sense. The interactions are collinear is what you are saying, right? – Peter Flom Dec 14 '23 at 16:38
  • Yes, that's correct. I found that the spq, pdi, and caps scores are correlated. – SilvaC Dec 15 '23 at 10:46