
I am trying to fit a model and view metrics of the model using tidymodels.

My data set is very imbalanced, so metrics beyond plain accuracy are important for judging the model.

Here I create the folds:

library(tidymodels)  # attaches rsample, recipes, parsnip, workflows, tune and yardstick

# Create stratified 5-fold cross-validation folds
set.seed(123)
FA_folds <- vfold_cv(FA_train, v = 5, strata = Class)

FA_folds

Then I specify the metrics:

FA_metrics <- metric_set(mn_log_loss, accuracy, sensitivity, specificity)
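To illustrate why sensitivity and specificity are in the metric set alongside accuracy, here is a small base R sketch. The labels are made up (95 "no" and 5 "yes" cases) and the "model" simply predicts the majority class every time; none of this comes from my real data.

```r
# Hypothetical imbalanced labels: 5 "yes" cases and 95 "no" cases.
truth <- factor(c(rep("no", 95), rep("yes", 5)), levels = c("yes", "no"))

# A trivial classifier that always predicts the majority class.
pred <- factor(rep("no", 100), levels = c("yes", "no"))

# Confusion table with predictions in rows and true classes in columns.
tab <- table(pred = pred, truth = truth)

acc_val  <- sum(diag(tab)) / sum(tab)              # share of correct predictions
sens_val <- tab["yes", "yes"] / sum(tab[, "yes"])  # true "yes" cases that were found
spec_val <- tab["no", "no"] / sum(tab[, "no"])     # true "no" cases that were found

c(accuracy = acc_val, sensitivity = sens_val, specificity = spec_val)
```

Accuracy looks high here even though no minority case is detected. Note that yardstick treats the first factor level as the event by default, so the level order of Class decides which class sensitivity refers to.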

Then I create a recipe:

FA_rec <- recipe(Class ~ ., data = FA_train) %>%
                 # Handle factor levels first: after step_dummy() no nominal predictors remain
                 step_novel(all_nominal_predictors()) %>%
                 step_other(all_nominal_predictors(), threshold = 0.01) %>%
                 step_unknown(all_nominal_predictors()) %>%
                 step_impute_median(all_numeric_predictors()) %>%
                 # Create dummy variables last, then drop zero-variance columns
                 step_dummy(all_nominal_predictors()) %>%
                 step_zv(all_predictors())

FA_rec

library(baguette)

FA_spec <-
  bag_tree(min_n = 10) %>%
  set_engine("rpart", times = 25) %>%
  set_mode("classification")

imb_wf <-
  workflow() %>%
  add_recipe(FA_rec) %>%
  add_model(FA_spec)

imb_fit <- fit(imb_wf, data = FA_train)
imb_fit

Here is where I get my error:

doParallel::registerDoParallel()
set.seed(123)

imb_rs <-
  fit_resamples(
    imb_wf,
    resamples = FA_folds,
    metrics = FA_metrics)  # reuse the metric set defined above

collect_metrics(imb_rs)

I'm not sure why I keep getting this error but I would really appreciate some help!
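For anyone trying to dig into this, one way to see the messages behind failed resamples is to look at the notes stored on the result. This is only a sketch: it assumes a recent version of tune that provides collect_notes(), and it uses modeldata's two_class_dat and a logistic regression as stand-ins for FA_train and the bagged tree, so the object names prefixed with toy_ are hypothetical.

```r
library(tidymodels)

# Stand-in data: two_class_dat ships with modeldata (attached by tidymodels).
set.seed(123)
toy_folds <- vfold_cv(two_class_dat, v = 5, strata = Class)

# Stand-in model: a plain logistic regression instead of the bagged tree.
toy_spec <- logistic_reg() %>% set_engine("glm")

toy_rs <- fit_resamples(
  toy_spec,
  Class ~ .,
  resamples = toy_folds,
  metrics = metric_set(mn_log_loss, accuracy, sensitivity, specificity)
)

# Warnings and errors captured during resampling, one row per note.
# The tibble is empty when every fold ran cleanly.
toy_notes <- collect_notes(toy_rs)
toy_notes

# Metrics averaged over the folds.
toy_metrics <- collect_metrics(toy_rs)
toy_metrics
```

Running the same two collect_*() calls on imb_rs should show which fold and which step (preprocessing, model fit or prediction) produced the failure, assuming the result object was created at all.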

    It could be that you have NA values in your data set. Alternatively, it could be any of a number of other problems. It looks like you're fairly new to SO; welcome to the community! If you want great answers quickly, please make this question reproducible. This includes sample data (e.g., data.frame(x=...,y=...), or the output of dput(head(dataObject))). Check out this resource for great questions: [making R reproducible](https://stackoverflow.com/q/5963269). – Kat Jan 16 '22 at 16:55
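The two suggestions in the comment above can be sketched in base R as follows. The data frame toy_train is a hypothetical stand-in for FA_train, with made-up columns and values.

```r
# Hypothetical stand-in for FA_train with a few missing values.
toy_train <- data.frame(
  Class = factor(c("a", "b", "a", NA)),
  x1    = c(1.5, NA, 3.2, 4.8),
  x2    = c("u", "v", NA, "u")
)

# Count missing values per column to see where NAs occur.
na_counts <- colSums(is.na(toy_train))
na_counts

# Print a copy-pasteable definition of the first rows for a reproducible question.
dput(head(toy_train, 3))
```

Missing values in the outcome column are worth checking in particular, since the recipe above only imputes numeric predictors.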
