[Rd] Bug in mclapply?
Winston Chang
winstonchang1 at gmail.com
Tue Dec 11 18:22:37 CET 2012
(Sorry for the repeat message; I forgot to send the previous message
in plain text.)
I've been using mclapply and have encountered situations where it
gives errors or returns incorrect results. Here's a minimal example,
which gives the error on R 2.15.2 on Mac and Linux:
library(parallel)
f <- function(x) NULL
mclapply(1, f, mc.preschedule = FALSE, mc.cores = 1)
# Error in sum(sapply(res, inherits, "try-error")) :
# invalid 'type' (list) of argument
I believe it happens when the following are true:
- The function returns NULL
- mc.preschedule = FALSE
- mc.cores >= length of the input data
Here are some examples I used to track down the problem.
library(parallel)
f <- function(x) NULL
# Error when mc.preschedule=FALSE and mc.cores >= length(x)
mclapply(1, f, mc.preschedule = FALSE, mc.cores = 1) # Error
mclapply(1, f, mc.preschedule = FALSE, mc.cores = 2) # Error
mclapply(1:2, f, mc.preschedule = FALSE, mc.cores = 1) # OK
# In the following 2 cases, I get an error about 10-20% of the time.
# The other times, the result is worse: it returns a list with only one
# element, not two!
mclapply(1:2, f, mc.preschedule = FALSE, mc.cores = 2) # Error, or list of length 1
mclapply(1:2, f, mc.preschedule = FALSE, mc.cores = 3) # Error, or list of length 1
# When mc.preschedule=TRUE, it always works
mclapply(1, f, mc.preschedule = TRUE, mc.cores = 1) # OK
mclapply(1:2, f, mc.preschedule = TRUE, mc.cores = 1) # OK
mclapply(1:2, f, mc.preschedule = TRUE, mc.cores = 2) # OK
# lapply() always works
lapply(1, f) # OK
lapply(1:2, f) # OK
# If the function returns a non-NULL value, it works
g <- function(x) 0
mclapply(1, g, mc.preschedule = FALSE, mc.cores = 1) # OK
mclapply(1:2, g, mc.preschedule = FALSE, mc.cores = 1) # OK
mclapply(1:2, g, mc.preschedule = FALSE, mc.cores = 2) # OK
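Because two of the cases above fail only intermittently, it may help to count the outcomes over many runs. The following is a minimal sketch, not part of the original report: `run_once` is a hypothetical helper name, and the 100 repetitions are an arbitrary choice. It sorts each call into "error", "short" (fewer elements than inputs), or "ok".

```r
library(parallel)

f <- function(x) NULL

# Hypothetical helper: classify a single mclapply() call.
run_once <- function(X, cores) {
  res <- tryCatch(
    mclapply(X, f, mc.preschedule = FALSE, mc.cores = cores),
    error = function(e) "error"   # sentinel; f() itself never returns a string
  )
  if (identical(res, "error")) {
    "error"
  } else if (length(res) != length(X)) {
    "short"                       # result list lost at least one element
  } else {
    "ok"
  }
}

# Repeat the intermittent case and tabulate what happened.
outcomes <- replicate(100, run_once(1:2, 2))
table(outcomes)
```

On an R version that does not show the problem, every run should land in "ok".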
Digging around in mclapply(), I think it happens because
mccollect(jobs) is returning an empty list. But when I use
options(error=recover) and debug the function, I find that when I call
mccollect(jobs) again, it returns a list with values -- it's as though
mccollect() is returning too early. This session illustrates it:
> mclapply(1, f, mc.preschedule = FALSE, mc.cores = 1)
Error in sum(sapply(res, inherits, "try-error")) :
invalid 'type' (list) of argument
Enter a frame number, or 0 to exit
1: mclapply(1, f, mc.preschedule = FALSE, mc.cores = 1)
Selection: 1
Called from: top level
Browse[1]> res
named list()
Browse[1]> res <- mccollect(jobs)
Browse[1]> res
$`12348`
NULL
The error happens on line 63 of mclapply.r, after `res <- mccollect(jobs)`
is called on line 61. At this point, res should be a named list with
values filled in, but it's empty. When I run `res <- mccollect(jobs)`
again, it gives the correct values.
Is there a good way to work around this issue for now?
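One possible stopgap follows from the observations above, which are that mc.preschedule = TRUE and non-NULL return values both avoid the failure. The sketch below boxes each result in a list so that a worker never returns a bare NULL, then unboxes after collection. `mclapply_boxed` is a hypothetical helper, not part of the parallel package. It has not been verified as a fix for the underlying mccollect() behaviour.

```r
library(parallel)

# Hypothetical wrapper: workers return list(value) instead of value, so a
# NULL result travels back as a non-NULL object and is unwrapped afterwards.
mclapply_boxed <- function(X, FUN, ...) {
  boxed <- mclapply(X, function(x, ...) list(FUN(x, ...)), ...)
  lapply(boxed, function(b) {
    # Leave worker errors (class "try-error") untouched so they stay visible.
    if (inherits(b, "try-error")) b else b[[1L]]
  })
}

f <- function(x) NULL
res <- mclapply_boxed(1:2, f, mc.preschedule = FALSE, mc.cores = 1)
length(res)
```

Where boxing is unwanted, leaving mc.preschedule at TRUE is the simpler option, since every such call in the examples above succeeded.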
-Winston