[Rd] mccollect with NULL in R 3.6
Gergely Daróczi
d@rocz|g @end|ng |rom r@pporter@net
Sat May 4 01:20:23 CEST 2019
On Fri, May 3, 2019 at 4:34 PM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
>
> On 5/3/19 3:04 PM, Gergely Daróczi wrote:
> > On Thu, May 2, 2019 at 7:24 PM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
> >> On 5/1/19 12:25 AM, Gergely Daróczi wrote:
> >>> Dear All,
> >>>
> >>> I'm running into issues with calling mccollect on a list containing NULL
> >>> using R 3.6 (this used to work in 3.5.3):
> >>>
> >>> jobs <- lapply(
> >>> list(NULL, 'foobar'),
> >>> function(x) mcparallel(identity(x)))
> >>> mccollect(jobs, wait = FALSE, timeout = 0)
> >>> #> Error in names(res) <- pnames[match(s, pids)] :
> >>> #> 'names' attribute [2] must be the same length as the vector [1]
> >>>
> >>> Note, setting a "name" for the jobs does not help, but the above works with
> >>> "wait=TRUE", and also if I change the order of NULL and "foobar", although
> >>> in that case, the second value (NULL) is ommitted. It also works with
> >>> mclapply fine.
> >>>
> >>> Any ideas/suggestion on how to get mccollect work with the above example?.
> >> NULL is not a valid job identification. Perhaps mccollect() could give a
> >> clearer error message, but I don't see, given its documentation, what
> >> else than throwing an error it should do. What is the problem you were
> >> trying to solve?
> > Thank you very much for looking into this!
> >
> > What was interesting to me is that it used to work before 3.6 -- I
> > have a script iterating over a list of data frames to train models,
> > but it started to fail with the 3.6 release.
> >
> > The "NULL is not a valid job identification" problem doesn't seem to
> > stand for my production job, as each list element has a proper name,
> > but I think I can reproduce this with this minimal example as well:
> >
> > library(parallel)
> > jobs <- lapply(1:2, function(x) {
> > mcparallel(if (x == 1) NULL else x, name = as.character(x))
> > })
> > mccollect(jobs, wait = FALSE, timeout = 2)
> > #> Error in names(res) <- pnames[match(s, pids)] :
> > #> 'names' attribute [1] must be the same length as the vector [0]
> >
> > So the jobs have proper name, but the NULL return value is causing
> > problems. Note, that it only causes problems when the NULL value is
> > the first, eg switching 1 and 2 works, also running this on 1:3 and
> > returning NULL on 2 etc.
> Thanks for the clarification, I got confused by that you wrote about
> "calling mccollect on a list containing NULL", so I was answering a
> different question. Now I see you actually meant collecting results from
> jobs that return NULL, and even though that writing jobs that return
> NULL is incorrect, I see that there is really a bug that causes the R error.
> > Now, I'm aware that 7 months ago this was added to the docs at
> > https://github.com/wch/r-source/commit/f0d15be765dcf92e2349429428d49cd5b212abb4
> > that NULL should not be returned, so it seems to be a user error on my
> > end, but it seems to fail only when NULL is the first returned element
> > in mccollect, and working OK eg when NULL is the 2nd or other element
> > (although with side effects, eg missed elements).
> NULL has been reserved even well before that documentation update, and
> if it somewhat worked as a result, it had been a coincidence. The
> documentation change was only to make it more obvious, but it already
> said "‘mccollect’ will return ‘NULL’ for a terminating job that has sent
> its results already after which the job is no longer available."
> > So maybe failing with an explicit error message whenever mccollect
> > hits a NULL for consistency might help here instead of skipping
> > "delivered.result <- delivered.result + 1L" when the returned value is
> > not raw at https://github.com/wch/r-source/commit/f0d15be765dcf92e2349429428d49cd5b212abb4#diff-e634fbaed323aac88667e7826865b160R72
> > ? Or even better (at least for my use case), maybe allowing to return
> > NULL and throwing the warning on line 108 in that case.
>
> Thanks for the report, I've fixed mccollect() in R-devel to return NULL
> from the job. To be ported to R-pathed unless problems appear.
Thank you very much for the great news and your work!
>
> Best
> Tomas
>
> >
> > Thanks for considering this,
> > Gergely
>
> >
> >> Best
> >> Tomas
> >>
> >>> Thanks,
> >>> Gergely
> >>>
> >>> [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> R-devel using r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
>
More information about the R-devel
mailing list