I have a very straightforward question. This is my dataset:
- I have NAs in both the outcome (Y = "SCORE") and the continuous predictor (X1_c):
## Missing data for the outcome (Y = "SCORE")
### for each level of the categorical predictor (X2)
data %>%
dplyr::select(PARTICIPANT, SCORE, X2) %>%
group_by(X2) %>%
count(is.na(SCORE))
output:
X2 `is.na(SCORE)` n
<fct> <lgl> <int>
1 Test1 FALSE 106
2 Test1 TRUE 12
3 Test2 FALSE 100
4 Test2 TRUE 18
## Missing data for the continuous predictor (X1_c)
data %>%
count(is.na(data$X1_c))
output:
`is.na(data$X1_c)` n
<lgl> <int>
1 FALSE 160
2 TRUE 76
mod1 <- lmer(SCORE ~ X1_c * X2 + (1|PARTICIPANT), data = data)
- My questions are:
1 I know that lmer does listwise deletion of NAs, but should I delete pairwise instead (i.e., delete all participants who lack a value in both Y and X1_c)?
If not, then:
2 Should I delete only the observations with NAs in the outcome Y (SCORE, n = 76 NAs)?
I've seen a lot of posts here (like this, this, and this) and elsewhere concerning NAs and regression, but I got lost among the many different answers. Thanks in advance! Any thoughts would be much appreciated.
EDIT: What do NAs mean here?
Y: scores on two tests/exams (X2). Each participant should have 2 different scores (1 for Test 1 and 1 for Test 2).
X1_c: score on another exam (which acts as a moderator). Each participant should have the same score repeated on both rows. I want to see if this score interacts with the two other tests (X2).
Then, NAs mean that the participant in question did not take one of the tests. So,
X2 `is.na(SCORE)` n
<fct> <lgl> <int>
1 Test1 FALSE 106
2 Test1 TRUE 12 # did not take test 1 (no score on this)
3 Test2 FALSE 100
4 Test2 TRUE 18 # did not take test 2 (no score on this)
and:
`is.na(data$X1_c)` n
<lgl> <int>
1 FALSE 160
2 TRUE 76 # did not take test X1 (the moderator)
The data look like this:
> head(data)
# A tibble: 6 x 5
PARTICIPANT SCORE X1_c X2
<int> <int> <dbl> <fct>
1 1 21 -42.9 Test1
2 1 NA -42.9 Test2 # didn't take Test2; took Test1 and X1
3 2 21 18.9 Test1
4 2 20 18.9 Test2
5 3 15 NA Test1 # didn't take X1; took Test1 and Test2
6 3 17 NA Test2
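To make the two deletion options concrete, here is a minimal base-R sketch on just the six rows printed above. The object names (`d`, `row_ok`, `listwise`, `participantwise`) are made up for illustration, and the "participant-wise" rule is only my reading of the stricter alternative, not something lmer does by itself.

```r
# The six rows printed by head(data) above, re-typed by hand
d <- data.frame(
  PARTICIPANT = c(1L, 1L, 2L, 2L, 3L, 3L),
  SCORE       = c(21L, NA, 21L, 20L, 15L, 17L),
  X1_c        = c(-42.9, -42.9, 18.9, 18.9, NA, NA),
  X2          = factor(c("Test1", "Test2", "Test1", "Test2", "Test1", "Test2"))
)

# Row-wise (listwise) deletion: keep a row only if every model variable
# is observed in that row. This is what na.action = na.omit does.
row_ok   <- complete.cases(d[, c("SCORE", "X1_c", "X2")])
listwise <- d[row_ok, ]

# Participant-wise deletion (hypothetical stricter rule): drop a participant
# entirely as soon as any one of their rows is incomplete.
complete_ids    <- setdiff(d$PARTICIPANT, d$PARTICIPANT[!row_ok])
participantwise <- d[d$PARTICIPANT %in% complete_ids, ]

nrow(listwise)         # rows kept under row-wise deletion
nrow(participantwise)  # rows kept under participant-wise deletion
```

On these six rows, row-wise deletion keeps the Test1 row of participant 1 and both rows of participant 2, whereas the participant-wise rule keeps only participant 2.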
Bonus: I was having some normality issues and thought that removing NAs could solve them, but it didn't. Any thoughts?
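One possible reason removing NAs made no difference, sketched under assumptions: with the default `na.action = na.omit`, incomplete rows are already dropped before fitting, so filtering them out beforehand should give the same fit and the same residuals. The sketch below uses `lm()` as a stand-in for `lmer()` (so that it runs in base R) and a simplified formula without the interaction, because the six sample rows are too few for the full model. The names `fit_all`, `d_cc` and `fit_cc` are made up for illustration.

```r
# The six rows printed by head(data), re-typed by hand
d <- data.frame(
  PARTICIPANT = c(1L, 1L, 2L, 2L, 3L, 3L),
  SCORE       = c(21L, NA, 21L, 20L, 15L, 17L),
  X1_c        = c(-42.9, -42.9, 18.9, 18.9, NA, NA),
  X2          = factor(c("Test1", "Test2", "Test1", "Test2", "Test1", "Test2"))
)

# Fit 1: leave the NAs in and let the model function drop incomplete rows.
# lm() is only a stand-in here; lmer() accepts the same na.action argument.
fit_all <- lm(SCORE ~ X1_c, data = d, na.action = na.omit)

# Fit 2: remove the incomplete rows by hand first, then fit.
d_cc   <- d[complete.cases(d[, c("SCORE", "X1_c")]), ]
fit_cc <- lm(SCORE ~ X1_c, data = d_cc)

# Both fits use the same rows, so coefficients and residuals should match.
same_coef  <- isTRUE(all.equal(coef(fit_all), coef(fit_cc)))
same_resid <- isTRUE(all.equal(unname(resid(fit_all)), unname(resid(fit_cc))))
```

If both comparisons come out `TRUE`, any non-normality in the residuals cannot be attributed to how the NAs were removed.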

