I am trying to fit an lme model with these data:

tot_nochc=runif(10,1,15)
cor_partner=factor(c(1,1,0,1,0,0,0,0,1,0))
age=runif(10,18,75)
agecu=age^3
day=factor(c(1,2,2,3,3,NA,NA,4,4,4))
dt=as.data.frame(cbind(tot_nochc,cor_partner,agecu,day))
attach(dt)

corpart.lme.1=lme(tot_nochc~cor_partner+agecu+cor_partner *agecu, 
                  random = ~cor_partner+agecu+cor_partner *agecu |day, 
                  na.exclude(day))

I get this error code:

Error in na.fail.default(list(cor_partner = c(1L, 1L, 2L, 1L, 1L, 1L, : missing values in object

I am aware there are similar questions in the forum. However, in my case:

  • cor_partner has no missing values;
  • the whole object is coded as a factor (at least from what the Global Environment shows).

I could exclude those NA values with an na.action, but I'd rather know why the function is reading missing values - to understand exactly what is happening to my data.

Ferdi
InverniE
  • Can you please include data and/or code that will provide us with a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) ? It's going to be hard to answer this question otherwise ... – Ben Bolker Jul 07 '16 at 16:17
  • @BenBolker Edited, thanks – InverniE Jul 07 '16 at 16:40
  • this looks like a typo/thinko to me. Can you explain what `na.exclude(day)` is supposed to be doing? I would generally do this by adding `day` to the data frame, then **not** using `attach()`, but instead using the combined data frame-including `day`- in the `data` argument ... ?? – Ben Bolker Jul 07 '16 at 16:54
  • also, in the data set you give there are only 8 values of `day`, and 10 values of all of the other variables, so I get a "variable lengths differ" error ... – Ben Bolker Jul 07 '16 at 16:56
  • This was an example matrix, they are not the data I am using. day is part of the dt matrix and has 10 values, including NAs, I have edited. – InverniE Jul 07 '16 at 17:10

4 Answers

tl;dr: apply na.exclude() (or another NA handler) to the whole data frame at once, so that the remaining observations stay matched up across variables. Filtering day on its own shortens only that one vector.

set.seed(101)
tot_nochc=runif(10,1,15)
cor_partner=factor(c(1,1,0,1,0,0,0,0,1,0))
age=runif(10,18,75)
agecu=age^3
day=factor(c(1,2,2,3,3,NA,NA,4,4,4))
## use data.frame() -- *DON'T* cbind() first
dt=data.frame(tot_nochc,cor_partner,agecu,day)
## DON'T attach(dt) ...
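
The note about avoiding cbind() can be checked directly. The following is a minimal sketch with made-up values (the `score` vector is a hypothetical stand-in for the numeric columns): cbind() builds a numeric matrix first, so a factor is reduced to its integer codes before as.data.frame() ever sees it, while data.frame() keeps the factor intact.

```r
# Toy data: `score` is a hypothetical stand-in for the numeric variables.
cor_partner <- factor(c(1, 1, 0, 1, 0))
score <- c(2.5, 3.1, 4.8, 1.2, 3.3)

# cbind() produces a numeric matrix, so the factor becomes its level codes.
via_cbind <- as.data.frame(cbind(score, cor_partner))

# data.frame() keeps each column's own class.
direct <- data.frame(score, cor_partner)

class(via_cbind$cor_partner)   # no longer a factor
class(direct$cor_partner)      # still a factor
via_cbind$cor_partner          # level codes (1 = "0", 2 = "1"), not the labels
```

In the cbind() version the model would treat cor_partner as a continuous predictor rather than a two-level factor.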

Now try:

library(nlme)
corpart.lme.1=lme(tot_nochc~cor_partner+agecu+cor_partner *agecu, 
              random = ~cor_partner+agecu+cor_partner *agecu |day, 
              data=dt,
              na.action=na.exclude)

We get convergence errors and warnings, but I think that's now because we're using a tiny made-up data set without enough information in it and not because of any inherent problem with the code.
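
As for why the original call reported missing values, two things seem to combine, and both can be sketched in base R. First, as far as the nlme documentation shows, the second formal argument of lme() is data and na.action defaults to na.fail, so an unnamed na.exclude(day) is matched to data rather than to na.action. Second, na.exclude() applied to a single vector only shortens that vector. In the sketch below, `lme_like` is a hypothetical stub that copies just those leading formals and fits nothing, and `x` is a made-up stand-in for the other variables.

```r
# Hypothetical stub: copies only the leading formals of nlme::lme() and
# returns the matched call instead of fitting a model.
lme_like <- function(fixed, data, random, na.action = na.fail) match.call()

# Same argument layout as the call in the question.
matched <- lme_like(y ~ x, random = ~ 1 | day, na.exclude(day))
names(matched)   # the unnamed argument is matched to `data`, not `na.action`

# Filtering one column shortens only that column ...
day <- factor(c(1, 2, 2, 3, 3, NA, NA, 4, 4, 4))
x <- seq_along(day)            # toy stand-in for the other variables
day_only <- na.exclude(day)
length(day_only)               # 8, while x still has 10 values

# ... whereas filtering the data frame drops whole rows together.
dt <- data.frame(x, day)
dt_complete <- na.exclude(dt)
nrow(dt_complete)              # 8
dt_complete$x                  # rows 6 and 7 are gone from every column
```

Passing data=dt and na.action=na.exclude by name, as in the corrected call, avoids both problems.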

Ben Bolker
  • Thanks, it works without any warning on the actual data. I thought that na.exclude(day) would automatically exclude the whole row based on the value in "day", not work at single column value, so good to know! – InverniE Jul 08 '16 at 13:40

The randomForest package has an na.roughfix function that "imputes Missing Values by median/mode".

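You can use it as follows:

library(randomForest)
## store_train is the training data frame; store is the response column
fit_rf<-randomForest(store~.,
        data=store_train,
        importance=TRUE,
        proximity=TRUE,   # argument names are case-sensitive
        na.action=na.roughfix)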
David Arenburg
kurapati

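If your data contain NA (missing) values, you can pass na.action = na.roughfix. No rows are dropped: missing entries are filled in (the median for numeric columns, the most frequent level for factors), so the data keep their original dimensions.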

rf<-randomForest(target~.,data=train, na.action = na.roughfix)
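
To make concrete what median/mode imputation means, here is a hand-rolled sketch in base R. It is not the randomForest implementation: `rough_impute` is a hypothetical helper, the toy `train` data frame is made up, and ties between factor levels simply go to the first level here.

```r
# Hypothetical helper illustrating median/mode imputation; this is NOT
# randomForest::na.roughfix(), only a sketch of the same idea.
rough_impute <- function(df) {
  for (col in names(df)) {
    x <- df[[col]]
    if (!anyNA(x)) next
    if (is.numeric(x)) {
      # Numeric column: fill gaps with the median of the observed values.
      x[is.na(x)] <- median(x, na.rm = TRUE)
    } else if (is.factor(x)) {
      # Factor column: fill gaps with the most frequent observed level.
      counts <- table(x)
      x[is.na(x)] <- names(counts)[which.max(counts)]
    }
    df[[col]] <- x
  }
  df
}

# Made-up toy data with one gap in each column.
train <- data.frame(
  height = c(1.6, NA, 1.8, 1.7),
  group = factor(c("a", "b", NA, "b"))
)
filled <- rough_impute(train)
filled   # same four rows, with the gaps filled in
```

Unlike dropping incomplete rows, this keeps every observation, at the cost of inventing values that may dampen real variation.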

Andrew Taylor

Another possible solution is to drop incomplete rows before fitting, with data <- na.omit(train). This removes every row that contains at least one NA, so the model is fitted on complete cases only.

patrickmdnet