I am using the example code posted here to show a progress_bar (from the progress package) with doParallel + foreach. The solutions there, however, use doSNOW (e.g. the code by Dewey Brooke that I am using for testing), which has been superseded and triggers this NOTE when checking a package with CRAN flags:

Uses the superseded package: ‘doSNOW (>= 1.0.19)’

Changing this is not as straightforward as I expected. If I only replace registerDoSNOW with registerDoParallel, and .options.snow with .options.doparallel, the code runs, but the doParallel version shows no progress bar at all.

I think this might be related to the use of .options.X. This part of the code is very obscure to me: .options.snow does work when using doSNOW, but the foreach man page does not document this argument. It is therefore not surprising that .options.doparallel does not work, since it was just a wild guess of mine.

Calling pb$tick() inside the foreach loop does not work either, and actually makes the result wrong. So I am out of ideas about where to put this call in the code.

Where does .options.snow come from? Where should pb$tick() go? How can I show the progress_bar object using doParallel here?

I paste below the code (doSNOW replaced by doParallel) for convenience, but credit again the original source:

library(parallel)
library(doParallel)

numCores<-detectCores()
cl <- makeCluster(numCores)
registerDoParallel(cl)

# progress bar ------------------------------------------------------------
library(progress)

iterations <- 100                               # used for the foreach loop  

pb <- progress_bar$new(
  format = "letter = :letter [:bar] :elapsed | eta: :eta",
  total = iterations,    # 100 
  width = 60)

progress_letter <- rep(LETTERS[1:10], 10)  # token reported in progress bar

# allowing progress bar to be used in foreach -----------------------------
progress <- function(n){
  pb$tick(tokens = list(letter = progress_letter[n]))
} 

opts <- list(progress = progress)

# foreach loop ------------------------------------------------------------
library(foreach)

foreach(i = 1:iterations, .combine = rbind, .options.doparallel = opts) %dopar% {
  summary(rnorm(1e6))[3]
}

stopCluster(cl) 
elcortegano

1 Answer

doParallel still uses the .options.snow argument, for whatever reason. I found this tidbit in the doParallel documentation:

The doParallel backend supports both multicore and snow options passed through the foreach function. The supported multicore options are preschedule, set.seed, silent, and cores, which are analogous to the similarly named arguments to mclapply, and are passed using the .options.multicore argument to foreach. The supported snow options are preschedule, which like its multicore analog can be used to chunk the tasks so that each worker gets a prescheduled chunk of tasks, and attachExportEnv, which can be used to attach the export environment in certain cases where R’s lexical scoping is unable to find a needed export. The snow options are passed to foreach using the .options.snow argument.
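
As a minimal sketch of what that passage describes, this is how a snow option can be handed to a doParallel cluster through .options.snow. The option used here is preschedule, one of the two snow options the quoted text names. The cluster size of 2 and the names snow_opts and res are illustrative choices, not from the original code. Note that the quoted passage does not list progress among the supported snow options, so I would not assume a progress callback passed this way gets called.

```r
library(doParallel)  # also attaches foreach, iterators and parallel

# Small snow-style cluster; 2 workers is an arbitrary choice for the sketch
cl <- makeCluster(2)
registerDoParallel(cl)

# Snow options named in the doParallel documentation:
# preschedule and attachExportEnv
snow_opts <- list(preschedule = TRUE)

# The options list is passed with .options.snow, even though the
# registered backend is doParallel rather than doSNOW
res <- foreach(i = 1:8, .combine = c, .options.snow = snow_opts) %dopar% {
  i^2
}

stopCluster(cl)
print(res)
```

With preschedule = TRUE the tasks are split into chunks up front, one chunk per worker, which is the behaviour the documentation compares to the multicore option of the same name.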

foreach is a powerful package, but whoever is maintaining it makes odd decisions.

Dewey Brooke