
We created a table in R with S&P 500 values and added columns such as the 10-day simple moving average. We set the NA values to 0. Example:

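library(TTR)  # provides SMA()

myStartDate <- '2020-01-01'
myEndDate   <- Sys.Date()
# 10-day simple moving average of the closing price
Dataset$SMA10 <- SMA(Dataset[, "Close"], 10)
Dataset$SMA10 <- as.numeric(Dataset$SMA10)
# Replace the leading NA values with 0
Dataset$SMA10[is.na(Dataset$SMA10)] <- 0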
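
To make the step above easier to reproduce without our data, here is a minimal sketch of the same idea in base R. The vector `close_prices` and the window `n = 3` are made up for illustration, and `stats::filter()` is swapped in for `SMA()` so that no extra package is needed; it is not the code we actually ran.

```r
# Hypothetical closing prices standing in for Dataset[, "Close"].
close_prices <- c(100, 101, 103, 102, 105, 107)

# Window length; our real code uses 10, a short window keeps the toy example small.
n <- 3

# Trailing moving average: each value is the mean of the current and the
# previous n - 1 prices. The first n - 1 results are NA.
sma <- as.numeric(stats::filter(close_prices, rep(1 / n, n), sides = 1))

# Replace the leading NA values with 0, as in our original code.
sma[is.na(sma)] <- 0

print(sma)
```

The first `n - 1` entries end up as 0, which mirrors what happens to the first nine rows of `SMA10` in our table.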

Our goal is to build a random forest model, so we split the data into a training set and a validation set:

set.seed(100) 
train <- sample(nrow(Dataset), 0.5*nrow(Dataset), replace = FALSE) 
TrainSet <- Dataset [train,] 
ValidSet <- Dataset [-train,] 
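
A minimal sketch of the same split on a small made-up data frame, with the sample size turned into an explicit whole number. The names `toy`, `n_train`, `train_idx`, `train_set` and `valid_set` are hypothetical, and `floor()` is an addition that is not in our original code.

```r
# Hypothetical stand-in for Dataset with an odd number of rows.
toy <- data.frame(
  Close = c(100, 101, 103, 102, 105),
  SMA10 = c(0, 0, 101.3, 102, 103.3)
)

set.seed(100)

# floor() makes the sample size an explicit whole number (here 2 of 5 rows).
n_train <- floor(0.5 * nrow(toy))
train_idx <- sample(nrow(toy), n_train, replace = FALSE)

# Rows drawn by sample() go to the training set, all others to validation.
train_set <- toy[train_idx, ]
valid_set <- toy[-train_idx, ]

# Together the two sets should cover every row exactly once.
print(nrow(train_set))
print(nrow(valid_set))
```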

When we try to fit the model with the following code:

library(randomForest)
model1 <- randomForest(SMA10 ~ ., data = TrainSet, mtry = 5, importance = TRUE, ntree = 500)
print(model1)

we get this error message: Error in x[, i] <- frame[[i]] : number of items to replace is not a multiple of replacement length

Searching the forum for this error, we found that it is usually related to NA values. That confuses us, because our table contains no NA values. Can you tell us what we are doing wrong? Thank you very much in advance.
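
For reference, a minimal sketch of the kind of checks we could run to confirm that no NA values remain and that every column is a plain vector. It uses a made-up data frame `toy` in place of `Dataset` and only base R. Whether these checks reveal the cause of the error in our real data is an assumption, not something we have verified.

```r
# Hypothetical stand-in for Dataset; the real table holds S&P 500 prices.
toy <- data.frame(
  Close = c(100, 101, 103, 102, 105),
  SMA10 = c(NA, NA, 101.3, 102, 103.3)
)

# Replace NA values with 0, as done for the real SMA10 column.
toy$SMA10[is.na(toy$SMA10)] <- 0

# Count remaining NA values per column; every count should be 0.
na_counts <- colSums(is.na(toy))
print(na_counts)

# Inspect the class of the table and the first class of each column.
print(class(toy))
col_classes <- vapply(toy, function(col) class(col)[1], character(1))
print(col_classes)

# Flag columns that carry a dim attribute (matrix-like columns), since
# these are not plain vectors.
matrix_like <- vapply(toy, function(col) !is.null(dim(col)), logical(1))
print(names(toy)[matrix_like])
```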

  • It looks like you're new to SO; welcome to the community! The answer to your question is probably specific to your data. To answer your question, make your question reproducible. Check it out: [making R reproducible questions](https://stackoverflow.com/q/5963269). – Kat May 09 '22 at 22:42
  • I get the feeling that your difficulty might live with your `size = 0.5*nrow(Dataset)`, if say, nrow is odd. – Chris May 10 '22 at 03:52
